LINQ Goodness: Google Charts Edition

My day job (one of them, anyway) is to design*, run and maintain Flying Shakes.

If someone had told me when I started that 90% of the code I’d write would be for the administration side of things (and 87.653% of all stats are made up, but you get my drift), I’d never have believed it.

 

Anyway, to cut a long story short, it was in this context that I came across a fascinating article from the Association for Computing Machinery (and no, I had not heard of them before either). I came across it a month or so ago, but lost the link.

 

With a little bit of Google-fu, I’ve found it once again: The World According to LINQ.

While it’s a fascinating article that appeals to the Computer Scientist in me (supposedly useless classes on in-depth database theory tend to do that), what caught my eye was the code sample right at the bottom for generating Google Chart URLs.

That sample is going to come in very handy for me and I thought I’d share it with you.

Go ahead and read the article.

 

*If you see me ranting on Twitter or Google+ about CSS, this is probably why.

BUILD Keynote–App Approval

Almost in passing, though it received big applause, Microsoft announced that the Windows App store will make its technical compliance tools available to app developers so that they can run them themselves and see the output.

This might not seem to be a big deal, but it’s a shot across the bow of the Apple App Store. Apple’s App Store has had a terrible time over the years as high-profile Apple developers’ angst has come to the fore over the labyrinthine and mystical process of app approval.

 

Microsoft are determined to do things differently. Obviously corporate prestige motivates Microsoft to keep some form of control on what ends up in the app store: no company wants to have PR disasters featured in their stores. On the other hand, Microsoft wants to move to Windows 8 and take their legions of Windows developers with them to the new Metro style apps. So, by de-mystifying the approval process, Microsoft have removed another stumbling block to developers selling applications through the Microsoft App Store. Microsoft can have its cake and eat it too.

 

[Image: app approval slide, pic from the BUILD keynote 1]

I know I’m feeling more optimistic about the approval process.

One of the things that makes Google+ so great

I’ve been thinking about this for a few weeks, and never really got a handle on how to articulate it.

Till today.

 

I was commenting on Scoble’s post. He says in the post that importing tweets into Google+ is a very bad idea.

Now, apart from being very pleased that he and I see eye to eye on this, I commented:

Agree with you +Robert Scoble. No tweets in here.
Keeping Google+ free of imported stuff has encouraged/forced people to post original material rather than reuse from Twitter, Facebook, Friendfeed, Flickr, etc.
It’s one of those things that make Google+ interesting and different from all the other social networks around. It’s made Google+ a destination in and of itself, rather than simply a portal or an aggregator (like Friendfeed is).

 

This is part of, shall we say, a philosophy around which Google+ is built. A philosophy which, dare I say it, is socially engineering us.

 

For example, the fact that comments and posts are not limited to 140 characters and allow rich formatting actually encourages people to comment. And not just comment – to comment substantially.  That is why Google+ has such a great reputation (already) for interaction.

 

Google has taken a very different approach to the other social networks. It is attempting to fulfil a very different version of what a social network should be.

And so far, it’s succeeding.

WCF Chat Update: Long Polling

Updating WCF chat continues slowly but surely. I have not made any commits yet, so you’ll have to wait to see the changes.

In addition to the changes to Authentication, there are changes to the callback mechanism that I originally wrote.

 

When I originally wrote the chat application, the callback was one of the first pieces of code that I wrote. The fact of the matter is, the only requirement for it was to work across the local network (or even simply between instances on the local machine). So when I wrote the Cloud version, suddenly callbacks had to work across NAT to let the application function across the internet.

 

Now, there are a number of possible design patterns that would allow the server to execute a callback on a remote client. 

 

The first is, indeed, the design pattern we use at the moment, where we actually have a callback. The enabler for this is found in WCF itself. The wsDualHttpBinding allows for dual HTTP connections – one in each direction. This allows you to invoke an operation on the client. However, in order for this to work, you need to have an instance of the WCF service on a per-session basis. So every new client session will spawn a new instance of the service. This means that you are going to end up with dozens (or hundreds, etc.) of long-lived instances. The question here is scalability. Is this scalable?

It might seem somewhat arrogant to talk about scalability, but if you design with scalability in mind, you’re not going to end up dealing with it later.

 

The other is something that, while not new, really hit the big time after Friendfeed released its Tornado server. Tornado supports long-polling HTTP connections. The basic design pattern involves a client making a call (HTTP or otherwise) to a server. The server receives the connection and keeps it open until it has something to return. Some long-polled connections eventually time out, and this is the implicit signal for the client to immediately open another connection. Others, such as the Friendfeed implementation, keep them open indefinitely.
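To make the long-polling pattern a little more concrete, here is a minimal in-memory sketch in C#. It is not the chat application's code and involves no HTTP or WCF at all: `LongPollServer`, `PollAsync` and `LongPollClient` are hypothetical names made up for illustration. The "server" holds each poll open until a message arrives or a timeout passes, and the client treats a timeout as the cue to poll again straight away.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for a long-polling server (illustration only).
public class LongPollServer
{
    private readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
    private readonly SemaphoreSlim _signal = new SemaphoreSlim(0);

    // Called whenever the server has something new for the client.
    public void Publish(string message)
    {
        _messages.Enqueue(message);
        _signal.Release();
    }

    // Holds the "connection" open until a message arrives or the timeout passes.
    // Returns null on timeout, which is the client's signal to poll again.
    public async Task<string> PollAsync(TimeSpan timeout)
    {
        if (await _signal.WaitAsync(timeout))
        {
            string message;
            if (_messages.TryDequeue(out message)) return message;
        }
        return null;
    }
}

public static class LongPollClient
{
    // Keeps re-issuing the poll until the expected number of messages has arrived.
    public static async Task<int> ReceiveAsync(
        LongPollServer server, int expected, Action<string> onMessage)
    {
        int received = 0;
        while (received < expected)
        {
            string message = await server.PollAsync(TimeSpan.FromMilliseconds(200));
            if (message == null) continue; // timed out: open another poll immediately
            onMessage(message);
            received++;
        }
        return received;
    }
}
```

In a real deployment the poll would be an HTTP request and the timeout would need to sit comfortably below whatever any proxies in between allow; both are assumptions this sketch glosses over.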

 

There are probably more that you can think of, but these are the two that I considered for the Chat Application. As with most things, the choice is between a Push model (where notifications are pushed to the client) and the Pull model (where information, Notifications or otherwise, is pulled from the server by the client).

 

The fact of the matter is that I like both. Both are Cool. And both are supported intrinsically by WCF – no coding voodoo to make things work.

Of the two, I’ve begun implementing the long polling method. Although it’s a radical departure, it will allow the overall design of the server side to remain the same. The WCF server remains as a Single Instance service, and so the implementation remains the same.

The WCF stack is written in such a way that when you mark an OperationContract as needing the Async Pattern, WCF will start the async operation and then go off and handle another request until that method returns. The End method that receives the results then returns the data to the client. In other words, it’s non-blocking.

I’ve not sorted out the exact specifics of implementation, but there will be changes to both the server and the client to accommodate this. In saying that, I’m doing a lot of simplification to the class structure. So hopefully what emerges from all these changes will be better than the current setup. Even I had to go back and follow the inheritance tree to figure things out.

These changes are happening in parallel with the changes to the authentication scheme.

 

So, while I’ve got no code, I leave you with this MSDN blog post on Async programming with WCF and this post that adapts it to long-polling specifically.

Some Interesting Code – your thoughts required

Without going into a long story, I found some interesting code here to convert anonymous types to any strongly typed, well, type.

 

public static object ToType<T>(this object obj, T type)
{
    // create an instance of the T type object:
    var tmp = Activator.CreateInstance(Type.GetType(type.ToString()));

    // loop through the properties of the object you want to convert:
    foreach (PropertyInfo pi in obj.GetType().GetProperties())
    {
        try
        {
            // get the value of the property and try
            // to assign it to the property of the T type object:
            tmp.GetType().GetProperty(pi.Name).SetValue(tmp,
                                      pi.GetValue(obj, null), null);
        }
        catch { }
    }

    // return the T type object:
    return tmp;
}

From this codeproject article.

Anyone have any thoughts on this?  Is it good? Bad? Inefficient? Crap??
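For comparison, here is one possible tidier variant, offered as a sketch rather than a verdict on the original. It assumes the target type has a public parameterless constructor, uses the generic parameter directly instead of round-tripping through a type name, and skips properties that don't match instead of swallowing exceptions. `CopyTo` and `Person` are hypothetical names invented for this example.

```csharp
using System;
using System.Reflection;

public static class ObjectExtensions
{
    // Copies same-named, type-compatible properties from any object
    // (an anonymous type, say) onto a new instance of T.
    public static T CopyTo<T>(this object source) where T : class, new()
    {
        if (source == null) throw new ArgumentNullException("source");

        var target = new T();
        foreach (PropertyInfo sourceProp in source.GetType().GetProperties())
        {
            // Skip write-only properties and indexers on the source.
            if (!sourceProp.CanRead || sourceProp.GetIndexParameters().Length > 0) continue;

            PropertyInfo targetProp = typeof(T).GetProperty(sourceProp.Name);
            if (targetProp == null || !targetProp.CanWrite) continue;
            if (!targetProp.PropertyType.IsAssignableFrom(sourceProp.PropertyType)) continue;

            targetProp.SetValue(target, sourceProp.GetValue(source, null), null);
        }
        return target;
    }
}

// Hypothetical target type used only to show the call.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}
```

With these definitions, `var anon = new { Name = "Ada", Age = 36 }; Person p = anon.CopyTo<Person>();` gives back a typed `Person`. It still pays the reflection cost on every call, so the "inefficient?" question stands for hot paths.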

WCF Server Chat Update:

I just left this reply to a comment on my last post on this project:

Hi there,

No, I haven’t been able to make any changes since my last commit.

However, I have been taking a look at it over the past week, since there are issues with it. As you say, authentication and authorisation are among them.

I’ve been looking at the possibility of using Forms authentication. This brings ASP.Net Membership, Profiles and Roles to the table and these can be used.

This of course requires a SQL server as a back end. And requires the use of SSL.

This requires significant changes to the code base, to move from the current storage model (XML in the case of the Windows service, and Azure Tables in the case of the Windows Azure role). And since authorisation and authentication are now handled separately, those WCF method calls that currently handle this aren’t required.

Also, since the codebase is effectively two separate projects, these changes need to be made twice.

The client will also need to be changed.

These changes do make a lot of sense and I’m well along in implementing them. So look out for a post soon on them.

 

I thought I’d let everyone know that this is the direction I’m taking things in.

 

Life is busy, which is why I haven’t been able to update things as much as I’d like.

There is one other particular problem with the WCF Server chat that I would have liked to have solved in my last commit.

The issue involves the server pinging other clients across NAT. Of course, this could be solved by moving to a pull model rather than a push notification model. While that would be easy, I want to take a good long look at getting push to work properly before trying any other models.

Push notification is where it’s at. So if anyone has any pointers to implementing it, please send them my way.

The Dell XPS 8300

If you follow me on Twitter, you’ll know that my ancient Pentium 4 powered desktop died on Saturday. I had powered it off and unplugged it from the socket while I was away on holiday. So it died a peaceful death in its sleep 🙂 May it rest in peace forever more….

 

So I had to go off to Dell and spec a replacement PC.

I settled on an XPS 8300 with a nice new Intel i7-2600. It’s the latest Sandy Bridge CPU, 4 cores clocked at 3.40GHz.

Base
Intel® Core™ i7-2600 Processor (3.40GHz, 8MB)

Memory
8192MB Dual Channel DDR3 1333MHz [2×4096] Memory

Video Card
Graphics : 1GB AMD Radeon HD 6670

Hard Drive
1.5TB (7,200rpm) Serial ATA Hard Drive

Microsoft Operating System
English Genuine Windows®7 Professional SP1 (64 BIT)

Sound Cards
Sound : Integrated 7.1 with THX® TruStudio

 

I’m quite looking forward to playing with this once it arrives. All that power…..

Once again I’ll be able to run Visual Studio on the desktop. As well as Virtual PC and VirtualBox.

Not to mention the fact that Flight Simulator is going to rock on this machine 🙂

Expect a review in about 2 weeks 🙂

Windows Azure Block Blobs

In Windows Azure Blob Storage, not all blobs are created equal. Windows Azure has the notion of Page Blobs and Block Blobs. Each of these distinct blob types aims to solve a slightly different problem, and it’s important to understand the difference.

To quote the documentation:

  • Block blobs, which are optimized for streaming.
  • Page blobs, which are optimized for random read/write operations and provide the ability to write to a range of bytes in a blob.

About Block Blobs

Block blobs are comprised of blocks, each of which is identified by a block ID. You create or modify a block blob by uploading a set of blocks and committing them by their block IDs. If you are uploading a block blob that is no more than 64 MB in size, you can also upload it in its entirety with a single Put Blob operation.

When you upload a block to Windows Azure using the Put Block operation, it is associated with the specified block blob, but it does not become part of the blob until you call the Put Block List operation and include the block’s ID. The block remains in an uncommitted state until it is specifically committed. Writing to a block blob is thus always a two-step process.

Each block can be a maximum of 4 MB in size. The maximum size for a block blob in version 2009-09-19 is 200 GB, or up to 50,000 blocks.

About Page Blobs

Page blobs are a collection of pages. A page is a range of data that is identified by its offset from the start of the blob.

To create a page blob, you initialize the page blob by calling Put Blob and specifying its maximum size. To add content to or update a page blob, you call the Put Page operation to modify a page or range of pages by specifying an offset and range. All pages must align to 512-byte page boundaries.

Unlike writes to block blobs, writes to page blobs happen in-place and are immediately committed to the blob.

The maximum size for a page blob is 1 TB. A page written to a page blob may be up to 1 TB in size.

So, before we determine what blob type we’re going to use, we need to determine what we’re using this particular blob for in the first place.

You’ll notice the above extract is quite clear about what to use block blobs for: streaming, such as streaming video. In other words, anything that we don’t need random I/O access to. On the other hand, page blobs have a 512-byte page boundary that makes them perfect for random I/O access.

And yes, it’s conceivable that you might need to host stuff such as streaming video as a page blob. When you think about this stuff too much, you end up imagining situations where that might be needed. These would be situations where you are directly editing or reading very select portions of a file. If you’re editing video, who wants to read in an entire 4MB block for one frame of video? You might laugh at the idea of actually needing to do this, but the Rough Cut Editor is web based and works primarily with web-based files. If you had to run that using Blob storage as a backend, you’d need to use page blobs to fully realise the RCE’s functionality.

So, enough day-dreaming. Time to move on.

Some groundwork

Now, in our block blob, each individual block can be a maximum of 4MB in size. Assuming we’re doing streaming video, 4MB is not going to cut it.
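As a quick back-of-the-envelope check on those limits, the sketch below works out how many blocks a file of a given size needs. The 4 MB and 50,000-block figures are the ones quoted above from the documentation for version 2009-09-19; `BlockMath` and `BlockCount` are hypothetical names, not part of the Azure SDK.

```csharp
using System;

public static class BlockMath
{
    // Limits as quoted above for storage version 2009-09-19.
    public const long MaxBlockSize = 4L * 1024 * 1024;
    public const int MaxBlocks = 50000;

    // Number of blocks needed to hold fileLength bytes, rounding up
    // so that a partial final block still counts as a block.
    public static long BlockCount(long fileLength, long blockSize)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException("fileLength");
        if (blockSize <= 0 || blockSize > MaxBlockSize)
            throw new ArgumentOutOfRangeException("blockSize");

        return (fileLength + blockSize - 1) / blockSize;
    }
}
```

So a 700 MB video cut into 1 MB blocks comes to 700 blocks, comfortably under the 50,000 limit, and 50,000 blocks of 4 MB each is where the roughly 200 GB ceiling comes from.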

The Azure API provides the CloudBlockBlob class with several helper methods for managing our blocks. The methods we are interested in are:

  • PutBlock()
  • PutBlockList()

The PutBlock method takes a base-64 encoded string for the Block ID, a stream object with the binary data for the block and an (optional) MD5 hash of the contents. It’s important to note that the ID string MUST be base-64 encoded or else Windows Azure will not accept the block. For the MD5 hash, you can simply pass in null. This method should be called for each and every block that makes up your data stream.
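Here is a small sketch of two ways to build such IDs. As far as I know, the service also expects every block ID within a given blob to be the same length, which both approaches satisfy; treat that as something to confirm against the Put Block documentation. `BlockIds`, `FromIndex` and `FromIndexPadded` are hypothetical helper names.

```csharp
using System;
using System.Text;

public static class BlockIds
{
    // Encodes the 4 raw bytes of the block number, so every ID
    // comes out as an 8-character base-64 string.
    public static string FromIndex(int index)
    {
        return Convert.ToBase64String(BitConverter.GetBytes(index));
    }

    // Alternative: zero-pad the number to six digits first, which keeps
    // the decoded ID human-readable (for example "000042").
    public static string FromIndexPadded(int index)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(index.ToString("D6")));
    }
}
```

The padded form is handy when debugging, because decoding an ID gives back a readable sequence number.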

PutBlockList is the final method that needs to be called. It takes a List<string> containing the ID of every block that you want to be part of this blob. Calling this method commits all the blocks contained in the list. This means that you could land up in a situation where you’ve called PutBlock but not included the ID when you called PutBlockList. You then end up with an incomplete and corrupted file. You have a week to commit uploaded blocks, so all is not lost if you know which blocks are missing. You call PutBlockList again, this time with the complete list of IDs, missing ones included, since the list you pass in defines the content of the blob.

There are a number of reasons why this is a smart approach. Normally, I fall on the side of developer independence, the dev being free to do things as he likes without being hemmed in. In this case, by being forced to upload data in small chunks, we realise a number of practical benefits. The big one being recovery from bad uploads – customers hate having to re-start gigabyte sized uploads from scratch.

Here be Dragons

The following example probably isn’t the best. I’m pretty sure someone will refactor and post a better algorithm.

Now there are a couple of things to note here. One being that I want to illustrate what happens at a lower level of abstraction than we usually work at, so that means no StreamReaders – we’ll read the underlying bytes directly.

Secondly, not all Streams have the same capability. It’s perfectly possible to come across a Stream object where you can’t seek. Or determine the length of the stream. So this is written to handle any data stream you can throw at it.

With that out of the way, let’s start with some Windows Azure setup code.

StorageCredentialsAccountAndKey key = new StorageCredentialsAccountAndKey(AccountName, AccountKey);
CloudStorageAccount acc = new CloudStorageAccount(key, true);

CloudBlobClient blobclient = acc.CreateCloudBlobClient();
CloudBlobContainer Videocontainer = blobclient.GetContainerReference("videos");
Videocontainer.CreateIfNotExist();

CloudBlockBlob blob = Videocontainer.GetBlockBlobReference("myblockblob");

Note how we’re using the CloudBlockBlob rather than the CloudBlob class.

In this example we’ll need our data to be read into a byte array right from the start. While I’m using data from a file here, the actual source doesn’t matter.

byte[] data = File.ReadAllBytes("videopath.mp4");

Now, to move data from our byte array into individual blocks, we need a few variables to help us.

int id = 0;
int byteslength = data.Length;
int bytesread = 0;
int index = 0;
List<string> blocklist = new List<string>();

  • id stores a sequential number indicating the ID of the block
  • byteslength is the length, in bytes, of our byte array
  • bytesread keeps a running total of how many bytes we’ve already read and uploaded
  • index is a copy of bytesread and is used to do some interim calculations in the body of the loop (probably will end up refactoring it out anyway)
  • blocklist holds all our base-64 encoded block IDs

Now, on to the body of the algorithm. We’re using a do loop here since this loop will always run at least once (assuming, for the sake of example, that all files are larger than our 1MB block boundary).

do
{
    byte[] buffer = new byte[1048576];
    int limit = index + 1048576;
    for (int loops = 0; index < limit; index++)
    {
        buffer[loops] = data[index];
        loops++;
    }

The idea of using a do loop here is to loop over our data array until less than 1MB remains.

Note how we’re using a separate byte array to copy data into. This is the block data that we’ll pass to PutBlock. Since we’re not using StreamReaders, we have to do the copy byte for byte as we go along.

It is this bit of code that would be abstracted away were we using StreamReaders (or, more properly for this application, BinaryReaders).

Now, this is the important bit:

    bytesread = index;
    string blockIdBase64 = Convert.ToBase64String(System.BitConverter.GetBytes(id)); //1

    blob.PutBlock(blockIdBase64, new MemoryStream(buffer, true), null); //2

    blocklist.Add(blockIdBase64);
    id++;
} while (byteslength - bytesread > 1048576);

There are three things to note in the above code. Firstly, we’re taking the block ID and base-64 encoding it properly.

Secondly, note the call to PutBlock. We’ve wrapped the second byte array containing just our block data as a MemoryStream object (since that’s what the PutBlock method expects) and we’ve passed in null rather than an MD5 hash of our block data.

Finally, note how we add the block ID to our blocklist variable. This will ensure that the call to PutBlockList will include the IDs of all of our uploaded blocks.

So, by the time this do loop finally exits, we should be in a position to upload our final block. This final block will almost certainly be less than 1MB in size (barring the usual edge case caveats). Since this final block is less than 1MB, our code will need a final change to cope with it.

int final = byteslength - bytesread;
byte[] finalbuffer = new byte[final];
for (int loops = 0; index < byteslength; index++)
{
    finalbuffer[loops] = data[index];
    loops++;
}
string blockId = Convert.ToBase64String(System.BitConverter.GetBytes(id));
blob.PutBlock(blockId, new MemoryStream(finalbuffer, true), null);
blocklist.Add(blockId);

Finally, we make our call to PutBlockList, passing in our list of block IDs (in this example, the “blocklist” variable).

blob.PutBlockList(blocklist);

All our blocks are now committed. If you have the latest Windows Azure SDK (and I assume you do), the Server Explorer should allow you to see all your blobs and get their direct URLs. You can download the blob directly in the Server Explorer, or copy and paste the URL into your browser of choice.

Wrap up

Basically, what we’ve covered in this example is a quick way of breaking down any binary data stream into individual blocks conforming to Windows Azure Blob storage requirements, and uploading those blocks to Windows Azure. The neat thing here is that by using this method not only does an MD5 hash (when you supply one) let Windows Azure check data integrity for you, but block IDs let Windows Azure take care of putting the data back together in the correct sequence.

Now when I refactor this code for actual production, a couple of things are going to be different. I’ll do the MD5 hash. I’ll upload blocks in parallel to take maximum advantage of upload bandwidth (this being the UK, there’s not much upload bandwidth, but I’ll take all I can get). And obviously, I’ll use the full capability of Stream readers to do the dirty work for me.

Here’s the full code:

StorageCredentialsAccountAndKey key = new StorageCredentialsAccountAndKey(AccountName, AccountKey);
CloudStorageAccount acc = new CloudStorageAccount(key, true);

CloudBlobClient blobclient = acc.CreateCloudBlobClient();
CloudBlobContainer Videocontainer = blobclient.GetContainerReference("videos");
Videocontainer.CreateIfNotExist();

CloudBlockBlob blob = Videocontainer.GetBlockBlobReference("myblockblob");

byte[] data = File.ReadAllBytes("videopath.mp4");

int id = 0;
int byteslength = data.Length;
int bytesread = 0;
int index = 0;
List<string> blocklist = new List<string>();

// Upload full 1MB blocks. As noted above, this assumes the data is
// larger than one 1MB block.
do
{
    byte[] buffer = new byte[1048576];
    int limit = index + 1048576;
    for (int loops = 0; index < limit; index++)
    {
        buffer[loops] = data[index];
        loops++;
    }
    bytesread = index;
    string blockIdBase64 = Convert.ToBase64String(System.BitConverter.GetBytes(id));

    blob.PutBlock(blockIdBase64, new MemoryStream(buffer, true), null);

    blocklist.Add(blockIdBase64);
    id++;
} while (byteslength - bytesread > 1048576);

// Upload whatever is left over as the final block.
int final = byteslength - bytesread;
byte[] finalbuffer = new byte[final];
for (int loops = 0; index < byteslength; index++)
{
    finalbuffer[loops] = data[index];
    loops++;
}
string blockId = Convert.ToBase64String(System.BitConverter.GetBytes(id));
blob.PutBlock(blockId, new MemoryStream(finalbuffer, true), null);
blocklist.Add(blockId);

// Commit all uploaded blocks, in order.
blob.PutBlockList(blocklist);
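
As a rough idea of where a refactoring could go, the sketch below splits any readable Stream into blocks without needing to seek or know its length, and computes an MD5 hash for each block as it goes. It deliberately leaves out the Azure calls so that it stands alone; `Block` and `BlockSplitter` are hypothetical names, and it handles the short-file and exact-multiple edge cases that the walkthrough above sets aside.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical container for one block's worth of data.
public class Block
{
    public string Id;        // base-64 encoded block ID
    public byte[] Data;      // the block's bytes (the last block may be short)
    public string Md5Base64; // base-64 encoded MD5 hash of Data
}

public static class BlockSplitter
{
    // Lazily reads the stream and yields blocks of at most blockSize bytes.
    public static IEnumerable<Block> Split(Stream source, int blockSize)
    {
        var buffer = new byte[blockSize];
        int id = 0;
        using (MD5 md5 = MD5.Create())
        {
            while (true)
            {
                // Read may return fewer bytes than requested, so keep
                // reading until the buffer is full or the stream ends.
                int filled = 0;
                int read;
                while (filled < blockSize &&
                       (read = source.Read(buffer, filled, blockSize - filled)) > 0)
                {
                    filled += read;
                }
                if (filled == 0) yield break; // nothing left to send

                var data = new byte[filled];
                Buffer.BlockCopy(buffer, 0, data, 0, filled);
                yield return new Block
                {
                    Id = Convert.ToBase64String(BitConverter.GetBytes(id)),
                    Data = data,
                    Md5Base64 = Convert.ToBase64String(md5.ComputeHash(data))
                };
                id++;

                if (filled < blockSize) yield break; // short block: end of stream
            }
        }
    }
}
```

Each yielded block could then be handed to blob.PutBlock as in the code above, with its ID added to the list for PutBlockList. I believe PutBlock expects the hash as a base-64 string like the one computed here, but check that against the SDK documentation before relying on it.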

Holiday Reading iList

While I don’t usually do this before going on holiday, this time I’m not taking any dead tree books with me at all.

Rather, I’m taking my trusty iPad with iBooks and Kindle for iPad installed. Since we’re flying Ryanair, with their strictness about baggage weights and sizes, the weight saved has been substantial. Usually I take a couple of paperbacks and a hardcover or two, so my bag’s a lot lighter this time around.

So, that reading list again, split up between iBooks and Kindle.

iBooks:

1. The Void Trilogy – Peter F Hamilton
2. Pandora’s Star – Peter F Hamilton
3. Judas Unchained – Peter F Hamilton
4. Servants of the People – Andrew Rawnsley
5. Life and Death of the Party – Andrew Rawnsley
6. Red November – W. Craig Reed
7. The Hobbit – J.R.R Tolkien
8. PayPal APIs: Up and Running – Michael Balderas

Kindle:

1. The Design of Everyday Things – Don Norman
2. Dreaming in Code – Scott Rosenberg

I think that’ll keep me busy for a week 🙂

As you can see the above list is heavily biased towards iBooks. The ability to buy a book off iBooks without even thinking about it is the probable reason. Amazon Kindle gives you too much pause for thought.

About the PayPal API book. Yes, I’m sad. I do have the tendency to program while on holiday. If you’ll recall, I did some major re-architecting of my Client Server Chat project while I was in Spain in December. So goodness knows what I might do this time around.

I hear The Design of Everyday Things is a seminal work and every designer should read it. Jeff Atwood of the Coding Horror blog (and Stack Overflow, Stack Exchange, etc.) highly recommends it.

Dreaming in Code is the most readable book about programmers and programming I’ve ever read. Though I must say reading it elicits the same reaction as watching Dennis Nedry screw Jurassic Park’s computer systems up: A Long Loud Cry of NOOOOOOOOOOOOOOOOOOOO!

Slightly changing the subject away from books, I got the Camera Connection Kit for my iPad. It works like a Swiss watch. If I could get Visual Studio on my iPad, I’d leave the laptop at home.

I’ll see you all in a week (and a bit, one always needs a holiday from holiday when you get back)

From the Department of MVC useful goodies (and sponsored by the department of Stackoverflow saved my butt)

I use the ipinfodb.com API in my app a lot to do geolocation. Well, as such things are wont to do, it went down for about an hour today. Curiously, the production site wasn’t affected at all, but my dev work was. So, in panic mode, I needed to add a country selector so that the rest of the site would have the country information it needed. Now the reason there wasn’t one already was a deliberate design choice. But I needed a backup plan for the next time ipinfodb went down.

As usual, Stackoverflow saved my butt (again). There’s a great answer that explains the way to do things in MVC using Razor.

Rather than steal the guy’s thunder, I’m just going to add one recommendation. In the body of the JavaScript function add the following:

location.reload();

And the page refreshes, including any changes triggered by the selection.

Go on and read the answer here.