Wednesday 26 June 2019

Storing text on Dropbox...

Sometimes amazing stuff crops up by accident. I use Dropbox, the cloud-based online storage service, to store files that I want to share between my computers, mobile, etc. Dropbox has served me well for many years, and I have rarely needed to contact their customer support.

However, a chance happening alerted me to something interesting that I hadn't considered before. Someone sent me a 'download-only' link to their Dropbox so that I could get some files, but one of those files was just a placeholder for a file that they would upload later to complete the set. All perfectly ordinary stuff that people use to do business every day.

But that placeholder was interesting. I accidentally tried to download it along with all of the other files, but it wouldn't download, and I got an alert warning me that it was 'zero bytes' long. It was at this point that I got interested, and I generated some test files to learn exactly how Dropbox stored them. It seems that if you store a zero length file, you can give it a short title (a few words), and it will occupy zero bytes of your Dropbox storage quota. Now zero length files are easy to produce - I just used an ASCII editor (I wrote a whole blog post on the topic a couple of months ago...) and named and saved the empty file directly to Dropbox. After a bit more experimentation, here's the resulting directory listing from my Dropbox 'Files' page (slightly edited):


As you can see, I wasn't particularly inventive in my choice of words, but just in case, I contacted Dropbox customer support to let them know that it is possible to store text on Dropbox without affecting your storage quota, and they agreed with my analysis. They also pointed out that you could use folder names in the same way...

Now I tend not to use the browser interface to Dropbox very much, but the directory page is interesting. It gives you a timed list of your activity on Dropbox, so for my recent activity, it showed the lines from Shakespeare in order, scrolling vertically as each new line was added... 

Now, this was very cool! Dropbox effectively gives you, for free, a performance tool that lets you publish short lines of text on Dropbox, where those lines of text from the file titles are displayed on a web page, in order, and timestamped. So as each empty file is added to Dropbox from a browser or just dropped into the local Dropbox folder on your computer, the Shakespearean soliloquy gets displayed one line at a time, using the text from the title of the file, with previous lines scrolling upwards or downwards (you can control this from the directory page)... No need to refresh the page - it just works.

The only catches are: 
- the title has to be short enough (you can see that some of those lines are getting close to being truncated), and 
- you can't put any punctuation in the text, except for symbols that would be allowed in file names anyway...

Well, I didn't just think of virtual thespians at this point; I also thought about song lyrics. It seems that you could potentially store song lyrics (assuming you have the rights to do so, of course), one line at a time, on Dropbox, with the right timing, and Dropbox will scroll them on the display as they are received. And this costs you nothing extra on top of whatever you normally pay for Dropbox. The files are empty, so they don't add to your storage quota! Making an app that does timed saves of empty files with titles from the individual lines in a text file is not that difficult, and Dropbox does everything else...
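For the curious, here is a minimal Python sketch of what such an app might look like. It is only an illustration under some assumptions of mine: the function names, the 60-character title limit and the 5-second delay are all made up (I only know that titles have to be 'short enough'), and it simply writes empty files into a local folder that you would point at your synced Dropbox folder. It doesn't use any Dropbox API at all.

```python
import re
import time
from pathlib import Path

# Assumed limit: the real truncation length on the Dropbox page is not known here.
MAX_TITLE_CHARS = 60


def to_title(line: str, max_chars: int = MAX_TITLE_CHARS) -> str:
    """Turn a line of text into something usable as a file name."""
    # Drop characters that common filing systems reject in file names.
    cleaned = re.sub(r'[\\/:*?"<>|]', "", line)
    # Collapse runs of whitespace, and avoid trailing dots or spaces.
    cleaned = " ".join(cleaned.split()).rstrip(". ")
    return cleaned[:max_chars].rstrip()


def publish_lines(text_file: Path, target_dir: Path,
                  delay_seconds: float = 5.0) -> list[Path]:
    """Create one empty file per non-blank line, pausing between each one."""
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for line in text_file.read_text(encoding="utf-8").splitlines():
        title = to_title(line)
        if not title:
            continue  # skip blank lines
        path = target_dir / title
        path.touch()  # zero length file: only the title carries the text
        created.append(path)
        time.sleep(delay_seconds)
    return created
```

Calling publish_lines(Path("soliloquy.txt"), Path.home() / "Dropbox" / "stage") would be one way to try it (both paths are hypothetical). Note that a repeated line, such as a chorus, maps to the same file name, so it would not appear twice without some extra handling.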

I'm now wondering if there is anything else that can be done with free text storage... Subtitles, commentaries, live comments, etc. There are a lot of things to consider before designing a solution based on what Dropbox provide: the latency and jitter of update times, for example, may mean that lyrics or subtitles aren't a good application. But the interesting thing is that there's going to be at least one application that is a perfect fit for this...

I asked Dropbox Customer Support if it was okay to publish this, and they said it was fine. So now you know too! I know it's only minor and trivial really, but for me, it is fascinating to discover that something like this is possible. My analysis follows below for those who are interested in system design...

(I'm just wondering what Charlie Brooker of 'Black Mirror'-fame would make of this...)

Analysis

Here's some serious analysis of this as a system design problem. 

Whenever you design a system, there are the things you want it to do, and a security analysis will give you pointers towards things that you don't want it to do. You can do risk analysis on those unwanted features, and decide on appropriate mitigations to reduce the residual risk to acceptable levels. But there's a hole in this very conventional process, and that is the things that you didn't specify that also are not security risks. These are the unexpected features, and they can be very interesting.

Risk analysis for security has evolved over a long time, and it has a sophisticated set of processes, approaches, tools and practitioners that are all based on lots of experience of looking at systems from a very specific viewpoint - security. These days, an additional and allied parallel activity has become important because of legislation like GDPR, and that is Privacy. Once again, there are processes, etc., and practitioners who are skilled at looking at systems from that viewpoint, and again the analysis leads to risks which result in mitigations to minimise the residual risk, etc. 

But there's that interesting hole: the 'things that you didn't specify the system to do, but it does them anyway, and they don't have any security or privacy implications'. These things fly 'under the radar', and I've never seen very much in the way of any formalisation of process or approach to identifying them, assessing them, and deciding if mitigations are appropriate. Unintended consequences are probably acceptable to some level with simple, non-networked systems that aren't critical in any way (that is, not used for life-support, emergency cover, critical infrastructure, etc.). But as systems become more interconnected, more networked, more 'Cloud'-based, then that little word 'unintended' starts to become more significant.

In the case outlined above, a design would look at what can happen, and would assess it based on the consequences. Having folder names and file titles outside of the storage quota may seem like a pragmatic solution to a user wanting to know how much storage they are using, but actually, the user's viewpoint of the storage differs from the actual storage that a provider like Dropbox has to provide, because the filing system itself uses storage, and users would probably not want to pay for that storage, even when it turns out that they could actually be using it themselves to store information for free.

What is more concerning are the repercussions of the unintended (or maybe, 'assessed as insignificant') features of a system when that system becomes stressed in some way. My completely uninformed suspicion is that the designers of the Dropbox system probably assumed that users would store files in folders, and that the titles of the files and folders would be insignificant in size in comparison to the actual content of the files themselves. But all it takes is someone to make an app that uses Dropbox as a filing system for song lyrics, subtitles, commentaries, etc., and things might start to escalate. 

The first protection mechanism that would probably be hit would be a limit on the number of folders or files within folders. But if I had designed this, then I would be expecting that this would be linked to a lot of storage as well. The 'user story' for lots of files of zero length isn't something that I would have thought of, and so is an 'unexpected' feature that I would have probably missed in my design. And if a designer misses a feature, then is it tested thoroughly? I'm pretty sure that the consumed storage is a key parameter used to monitor a user's account - I know this because Dropbox is very good at telling me how much storage I use, and especially when I'm getting close to using it all up. Selling me additional storage is good for me and good for Dropbox. But if lots of users were to start using a filing system that exploited the folder and file name 'free' storage feature, then the number of files and folders being used might start to become an important parameter for Dropbox, because suddenly the design rules that said that the storage required for that purpose was insignificant in comparison to the actual chargeable storage space consumed by files uploaded by users, would be wrong...
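To make that design rule a bit more concrete, here is a rough Python sketch that compares the bytes held in file contents with the bytes held in file and folder names for a local folder. The function name and the returned keys are my own invention, and this says nothing about how Dropbox actually accounts for storage internally; it just shows the two quantities that the design rule assumes are wildly different in size.

```python
from pathlib import Path


def storage_profile(root: Path) -> dict:
    """Sum the content bytes and the name bytes found under a folder."""
    content_bytes = 0
    name_bytes = 0
    for entry in root.rglob("*"):
        # Every file and folder name costs the filing system something.
        name_bytes += len(entry.name.encode("utf-8"))
        if entry.is_file():
            # Only file contents would count towards a size-based quota.
            content_bytes += entry.stat().st_size
    return {"content_bytes": content_bytes, "name_bytes": name_bytes}
```

For an ordinary folder of documents and photos, I would expect name_bytes to be tiny next to content_bytes. For a folder full of empty files with a line of lyrics as each title, content_bytes would be zero and the names would be everything.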

Changes in design rules when you have a system launched and operational can be awkward, and they can even be potentially expensive, or maybe catastrophic. One of the reasons that I contacted Dropbox when I found out about the zero length file storage was because I was curious about their response. Dropbox were very open about the storage, and agreed that I could publish this blog on what I had found. It will be very interesting to see what happens next...

Now if I was Dropbox, then I would be assessing exactly what the consequences might be if a lot of apps started loading the Dropbox system with file and folder title metadata, and those design rules would probably be revisited. Additional checks and catches might be put in place to check for lots of zero length files, and limits for the number of zero length files could be announced and policed. Processes for designing solutions might be revisited, and new checks and balances added to try and catch unintended consequences in future designs.
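As an illustration of the sort of check I mean, here is a small Python sketch that counts the zero length entries in an account listing and flags the account if there are too many. Everything here is hypothetical: the Entry record, the function name and the limit of 1000 are my own placeholders, not anything that Dropbox has announced or implemented.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str  # file title as the user sees it
    size: int  # content length in bytes


def zero_length_report(entries, max_zero_length: int = 1000) -> dict:
    """Count the empty files and say whether the assumed limit is exceeded."""
    zero_count = sum(1 for e in entries if e.size == 0)
    return {
        "zero_length_count": zero_count,
        "over_limit": zero_count > max_zero_length,
    }
```

A real check would probably need to be smarter than a simple count, since the odd empty placeholder file is perfectly legitimate (that is how I stumbled on all this in the first place).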

But actually, there's something much more interesting that Dropbox could do, in parallel to this. They now have a lead in the 'analysis of unintended consequences of plain ordinary features in large systems', and by the time they have done all of the analysis and fixes, they will be world experts in what you need to do to mitigate against this type of potentially lurking problem. This sort of 'hard-earned' practical and theoretical expertise from actually solving a real-world problem is worth a LOT of money, and the next unintended consequence in someone else's system might be something much more damaging and dangerous, particularly if it isn't something that security or privacy risk analysis would have caught (and by its nature, it very probably will be!). At this point, that Dropbox expertise could well be one of their most precious commodities...

I have to thank Dropbox customer support for their help in this. They were wonderful.

All free!

All of the analysis here on this blog is free, of course (I am a CISSP in the real world until October 2019, but this is pro bono because it is fascinating...), but donations are always welcome!


  

