Load Balancing with the Azure App Service File Connector
June 25, 2015
The File Connector is one of the built-in protocol apps available in the Marketplace when you go to provision an API App. Through configuration only, this app allows you to perform file-based operations from a Logic App in Azure, bridging the boundaries of your corporate network.
The documentation from Microsoft clearly explains how you can configure the app and then download a listener agent to install on the on-premises server where the files live. In most cases this would be a single server, but I got to wondering what would happen if you installed the agent on multiple servers. So I tried it out using both the “Get File” and “Upload File” operations.
Turns out that the File Connector will talk to all of those servers, provided that you have set up the same base directory path on all of them. This path is configured at the time you provision the API App – not when you use it within a Logic App. The configuration of the “File Path” property within the instance of this app only defines the sub-directory within the base path, as well as the file name. Incidentally, if this sub-directory does not exist at runtime, it is automatically created for you in the case of the “Upload File” operation. Unfortunately, this is not the case with the base directory – if that doesn’t exist you get a rather meaningless 500 error recorded in the tracking log:
The requested page cannot be accessed because the related configuration data for the page is invalid.
In any case, I set up a simple Logic App and decided to test using the “Get File” and “Upload File” methods, respectively, to fetch and store files from assorted VMs hosted on my laptop. Once I had the syntax down for specifying the file path (this post by Saravana Kumar proved very helpful), I was amazed at how easy it was to get this working! Here is my Logic App:
Of special note is the expression used for assigning the output filename:
This expression simply prepends the UTC timestamp to the original filename, so we can easily track when the file was processed by the Logic App.
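The Logic App expression itself is not reproduced here, but the idea is easy to sketch. The snippet below is Python rather than Logic App expression syntax, and the function name `stamped_name`, the underscore separator, and the timestamp format are all my own assumptions for illustration:

```python
from datetime import datetime, timezone
from typing import Optional


def stamped_name(original_name: str, now: Optional[datetime] = None) -> str:
    """Return the original filename prefixed with a UTC timestamp.

    Hypothetical helper: the format below is an assumption, not the
    format used by the Logic App expression in the post.
    """
    # Default to the current time; normalise any supplied time to UTC.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # A colon-free format keeps the result valid as a Windows filename.
    return f"{now.strftime('%Y%m%dT%H%M%SZ')}_{original_name}"


if __name__ == "__main__":
    print(stamped_name("orders.txt"))
```

Putting the timestamp first also means a plain alphabetical sort of the target folder lists files in processing order.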
I also created a simple console app that would copy a file to the outgoing directory once a minute, giving each copy a sequential number in the filename.
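For anyone wanting to reproduce the test, here is a rough Python equivalent of that file dropper. It is a sketch, not my original console app: the names `numbered_name` and `drop_copies`, the four-digit padding, and the example paths are all hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import List


def numbered_name(template: Path, sequence: int) -> str:
    """Build a filename such as sample_0001.txt from sample.txt."""
    return f"{template.stem}_{sequence:04d}{template.suffix}"


def drop_copies(template: Path, outgoing: Path, count: int,
                interval_seconds: float = 60.0) -> List[Path]:
    """Copy the template into the outgoing folder `count` times,
    pausing between copies (one minute by default)."""
    outgoing.mkdir(parents=True, exist_ok=True)
    written = []
    for sequence in range(1, count + 1):
        target = outgoing / numbered_name(template, sequence)
        shutil.copyfile(template, target)
        written.append(target)
        # No need to wait after the final copy.
        if sequence < count:
            time.sleep(interval_seconds)
    return written


if __name__ == "__main__":
    # Example paths only; adjust to the folder the trigger watches.
    drop_copies(Path("sample.txt"), Path("outgoing"), count=20)
```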
As for the multiple on-premises servers, here’s what I discovered:
Sending Files to Multiple Hosts
It turns out that the File Connector will load balance between all configured listeners that are active. When I say “load balancing”, I don’t necessarily mean any intelligent assessment of resource usage; I’m not sure how an Azure-hosted service could determine which on-premises server is least busy. In my tests it just seemed to alternate randomly amongst all available hosts, although it distributed fairly evenly across the twenty transactions:
| Server A | Server B | File Created | File Written | Total Time |
Of course in this scenario, you cannot rely on the file being written to any particular host server – but then if you needed that functionality, why would you consider load balancing in the first place? The important thing to realise is that only one host will ever receive any individual file; this is not a multicast scenario where each host gets its own copy of the file.
Receiving Files From Multiple Hosts
Here the results were a little less consistent. By installing the agent on multiple hosts and configuring the “outgoing” drop directory (i.e. the folder path for the trigger File Connector), I discovered that the Logic App did indeed pick up files from both locations. However, it seemed to have problems deleting the file once it was retrieved, thereby causing the same file to be processed multiple times.
In looking at these results, consider that I had my console app running on both “send” servers, each dropping one file per minute, but the 2nd server’s app initiated 30 seconds after the first. I also included the host name in the sample file for each so I could easily derive the source of each file on the receiving host:
The chart plots the elapsed time from each file’s creation point to the time it was written to the target host. The lines connect the initial writes, while the isolated dots represent subsequent writes of the same file to the target host. Looking at this data you can observe some interesting characteristics:
- The first three files were processed pretty much as expected.
- From that point on, the File Connector clearly had issues with deleting the file once it was picked up (I observed this myself during the test: I saw the file being written on the target host whilst it still remained in existence on the sending host).
- Despite the fact that the source files were being written at regular intervals, the processing was not so regular; even though the trigger recurrence was set at one minute, some “batching” seemed to occur.
While it’s possible that there were some issues with my own servers or network connectivity that impacted on this test (i.e. preventing files from being deleted), it does seem clear that this multiple receive scenario is not a good idea if you require that a file be written to the destination exactly once – unless of course you implement some additional behaviour in your Logic App to guard against this.
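What might such a guard look like? I didn’t build one for this test, so the following is only a sketch of the idea in Python, not Logic App configuration. The class name `DuplicateGuard` is hypothetical, and the in-memory set stands in for whatever persistent store a real workflow would need to survive restarts:

```python
import hashlib
from typing import Set, Tuple


class DuplicateGuard:
    """Remember which files have been handled so repeats can be skipped.

    Assumption: a file is identified by its name plus a hash of its
    content, so a re-delivered file matches but a changed file does not.
    """

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()

    def is_new(self, file_name: str, content: bytes) -> bool:
        """Return True the first time a file is seen, False afterwards."""
        key = (file_name, hashlib.sha256(content).hexdigest())
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    guard = DuplicateGuard()
    for name, body in [("a.txt", b"one"), ("a.txt", b"one"), ("b.txt", b"two")]:
        action = "write" if guard.is_new(name, body) else "skip duplicate"
        print(name, action)
```

Checking before each write to the destination would turn the repeated pickups seen in the chart into harmless no-ops, at the cost of maintaining that extra state.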
There were a few things I had to do to get the sending of files working with the trigger File Connector:
- I needed to give the “Network Service” account full control permissions on the folder where the outgoing files were being dropped. This is because the IIS service that is created by the File Connector (through the Hybrid Connection Manager) runs under this account, and it needs that level of access to be able to delete the files once they are picked up.
- I needed to modify the Web.config file for the File Connector service as per this blog entry in order to avoid the following error reported in the App Service tracking:
Error: cannot add duplicate collection entry of type ‘mimeMap’ with unique key attribute ‘fileExtension’ set to ‘.json’
All in all, I found the send (upload) feature of the File Connector to be pretty stable (even if error reporting leaves a bit to be desired). The receive (get) feature is a bit fiddly, though… sometimes files were not readily deleted even with a single server feeding them to Azure. But given that this is still in preview, we can hope that things will continue to improve.