Stripe CTF3 write-up

I’ve been kinda distracted at work for about a week now, because of the third Capture the Flag competition hosted by Stripe. The first CTF was on application security – you had a ‘locked down’ account on a computer, possibly with various programs available, and you had to break out of the account to read a file you weren’t meant to be able to access, containing the password for the next level. I wasn’t aware of that first competition.
The second one was on web application security – you were presented with web access to a web app, and the source code for the apps (each level in a different language), and had to exploit various vulnerabilities in order to get the password for the next level. The web apps ranged from basic HTML form processing, to rudimentary AJAX Twitter clones, to APIs for ordering pizza. The vulnerabilities ranged from basic file upload validation, to SHA-1 length extension attacks, to JavaScript injection, all culminating in a level that involved using port numbers to dramatically reduce the search space for a 12 character password. I completed that one and won a t-shirt.
The third one, the one that has just happened, was different. It was ‘themed’ around Distributed Systems, rather than security. You’d be given some sample code that you’d have to speed up, either by finding bottlenecks in the code or by making the system distributed and fault tolerant. Spoilers will follow. Although the contest is now over, the level code (and associated test harness) is available here if you still want a go. I will note that it’s entirely possible to rewrite each level in a language you’re familiar with (I didn’t take that approach though, given that half the fun is not knowing the language).
So. To details.
I didn’t manage to finish it, although I made it to the last level, which I sunk far more time into than was healthy – I’m fairly certain my tiredness at work for a couple of days was due to this level.
Level 0
A basic Ruby program that reads in text and, if a word appears in a dictionary file, encloses that word in angle brackets.
I didn’t know Ruby, but I had a good inkling of where to tackle this level, given how simple the program was. A quick google of Ruby data structures and Ruby’s String split() method confirmed my idea. The original code did a string split() on the dictionary and then repeatedly looked up each word against the Array that returns. By transforming that Array into Ruby’s notion of a Set, I could gain the speed boost of fast hash-based membership checking.
I also modified the comparison to do an in-place replacement, as that saved the cost of duplicating the entire string. I’m unsure how much weight that had against the Array->Set change.
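The fix was Ruby-specific, but the idea is language-agnostic. Roughly the same approach in Python terms (a sketch under assumed file names, not my actual Ruby diff):

```python
import re
import sys

# Load the dictionary once; a set gives O(1) average membership checks,
# whereas repeatedly scanning a list is O(n) per lookup.
with open("words.txt") as f:          # hypothetical dictionary file
    entries = set(f.read().split())

def bracket_known_words(line):
    # Wrap each dictionary word in angle brackets, leaving other text alone.
    return re.sub(r"[^ \n]+",
                  lambda m: "<%s>" % m.group(0) if m.group(0) in entries else m.group(0),
                  line)

if __name__ == "__main__":
    for line in sys.stdin:
        sys.stdout.write(bracket_known_words(line))
```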
Level 1
A bash script that tries to mine the fictional currency of Gitcoin. Gitcoin is essentially like Bitcoin: you “mine” gitcoins by adding a valid commit to the repository, and that commit must modify the ledger file to add one to your own total of gitcoins. A valid commit is one whose commit hash is lexicographically less than the value contained in the difficulty file – that is to say, if the difficulty contained 00001 your commit hash would have to start with 00000[0-f]. Because of how git works, you have to find such a commit before anyone else mining against the same repository finds one.
There was one main thing in this level to fix, and that’s the call out to git that hashes the would-be commit object to see if it’s valid; if it isn’t, the script alters the commit message text in some way and hashes again. This is slow, for a couple of reasons. Git likes to lock its repository files during operations, so you can’t do parallel searches for valid commits. But also, git objects have to have a very specific format, which git takes time to generate before returning the hash. The final thing is that each commit contains the hash of its parent commit, so naturally should another miner find a gitcoin before you, you have to start the search over again.
To get around this, I moved the SHA-1 testing over to Python. I formed the commit object that git creates manually – the header consisting of the term “commit ”, the length of the commit body, and a null byte. I left the body (which itself has to have a very specific format) as it was in the original script. I called Python’s SHA-1 library to do the hashing, which never touches the repository, meaning I could set 8 separate processes going at once, each trying a unique set of commit messages. Upon success they spat out a commit message into a file.
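The core of the idea, sketched in Python (placeholder body and difficulty – this isn’t the exact script I ran):

```python
import hashlib
from itertools import count

def find_coin(body_template, difficulty):
    """Search for a nonce that makes the commit's SHA-1 beat the difficulty.

    body_template is the commit body git would create (tree, parent, author,
    committer and message lines), with a %d placeholder for the nonce.
    """
    for nonce in count():
        body = (body_template % nonce).encode()
        # Git hashes "commit <length>\0<body>", not just the body itself.
        header = b"commit " + str(len(body)).encode() + b"\0"
        sha = hashlib.sha1(header + body).hexdigest()
        # Hex digests compare lexicographically, just like the difficulty file.
        if sha < difficulty:
            return nonce, sha

# Example (placeholder body - a real one needs valid tree/parent hashes):
# nonce, sha = find_coin(BODY_WITH_NONCE_PLACEHOLDER, "000001")
```

Each of the 8 processes just started its nonce counter at a different offset so they never tried the same message twice.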
Annoyingly my solution then became quite clunky, with me manually feeding that file into a copy of the original script that bypassed the searching and just pushed the correct commit. Ideally I’d have automated that into the first script, but it was enough to get a valid commit pushed to Stripe’s git servers, unlocking the next level.
Incidentally this level had a bonus round where, instead of competing against four Stripe bots, you’d be competing against the other players who had completed the level. Needless to say, people very quickly started throwing GPU-based SHA-1 tools at it, and I was outclassed by a wide margin.
Level 2
Node.js – another language I had no experience with (although I do know JavaScript). You were given skeleton code that had to mitigate a DDoS attack. Your code would be placed in front of an under-attack web service, and it had to ensure all legitimate requests got through and, strangely enough, let enough illegitimate requests through to keep the servers busy but not falling over (you lost points in the scoring system for how long the target servers were idle).
In practice this was rather simple, as the requests were easily differentiated – each legitimate IP would only make a few requests, relatively far apart. Simply keeping a log of the IPs seen and when they were last seen was enough to differentiate the legitimate mice from the illegitimate elephants. You also had to load balance between the two servers that were available – they would fall over if they had to process more than 4 requests at a time. You knew how long each request had before the backend servers timed the connection out, so by keeping a log of when each request was proxied, and to which server, you could estimate how many requests were likely to still be in flight on each server.
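My actual solution lived in the provided Node.js skeleton, but the decision logic boiled down to something like this Python sketch (the thresholds and lifetimes here are made up for illustration):

```python
import time

seen = {}                     # ip -> timestamp of last request
in_flight = {0: [], 1: []}    # backend index -> timestamps of proxied requests
REQUEST_LIFETIME = 2.0        # illustrative: how long a backend holds a request

def is_legitimate(ip, now):
    # Elephants hammer us repeatedly from the same IPs; mice show up rarely.
    last = seen.get(ip)
    seen[ip] = now
    return last is None or (now - last) > 30.0   # threshold is illustrative

def pick_backend(now):
    # Forget requests we believe each backend has already finished.
    for backend, stamps in in_flight.items():
        in_flight[backend] = [t for t in stamps if now - t < REQUEST_LIFETIME]
    # Prefer whichever backend has the fewest requests still outstanding,
    # as long as it stays below the limit of 4 concurrent requests.
    backend = min(in_flight, key=lambda b: len(in_flight[b]))
    if len(in_flight[backend]) < 4:
        in_flight[backend].append(now)
        return backend
    return None   # both busy: drop or queue the request
```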
Pretty simple.
Level 3
The penultimate level. Scala. I had great trouble with the language on this one, I suspect partly because it’s close enough to Java that I get confused mentally when translating what I want to do into Scala syntax.
You were given four servers – a master, and three slave servers that would never be contacted by the test harness. You were provided with a directory, and had to index all the files under it. Then you had to respond to a barrage of search requests (for which you were also expected to return substring matches).
The default code was incredibly poor, so some optimisations were immediately obvious. Firstly, the master server only ever sent search requests to the first of the slave nodes, which also had to index and search the entire corpus. There are two approaches here – split the corpus and send each search to all nodes, or split the searches but make each node index the entire corpus. I went with the former. I split the corpus based on the root subdirectory number, so slave0 would index a subdirectory when subDir % 3 == 0. Any files directly under the root directory were indexed by all nodes.
The second obvious problem was that the index was an object containing a list of files that the searcher needed to search. That object was serialised to disk, and the searcher would read it back in; then, for each query, it would go off and load each file from disk before searching it. My first change was to never serialise the object out, but keep it in memory. That didn’t make much of a difference. Then two options presented themselves. I could try constructing an inverted index – one mapping each trigram (as I had to handle substring searches) to a list of the files and lines where that trigram was found. Or I could take the lazy option of reading all the files in at indexing time (you had 4 minutes until the search queries would start) and storing their contents directly in the in-memory index. I took the lazy option, transforming the index list into a HashMap of FilePath to Contents. And that pretty much got me to a pass. Somehow. I don’t feel like that was enough work myself, but that was more than made up for by the last level.
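Sketched in Python rather than Scala (the paths and the exact sharding rule are illustrative), the whole “index” ended up being little more than this:

```python
import os

def shard_matches(top_dir, node_id, num_nodes):
    # Illustrative: the level's top-level directories were numbered, so a
    # simple "number % num_nodes == node_id" check decided ownership.
    digits = "".join(c for c in top_dir if c.isdigit())
    return digits != "" and int(digits) % num_nodes == node_id

def build_index(root, node_id, num_nodes=3):
    """Read every file this node is responsible for straight into memory."""
    index = {}   # path -> file contents
    for dirpath, _dirs, files in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = rel.split(os.sep)[0]
        # Files directly under the root (rel == '.') are indexed by every node.
        if rel != "." and not shard_matches(top, node_id, num_nodes):
            continue
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as f:
                index[path] = f.read()
    return index

def search(index, query):
    # Plain substring search over the in-memory contents.
    return [path for path, text in index.items() if query in text]
```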
Level 4
I couldn’t crack this one. I tried for days. I think it was from Sunday through Wednesday, excepting some time out for the day job.
The language was Go. I know no Go. The challenge: a network of servers, each with a SQLite database. The network is unreliable, with lag and jitter randomly added, and network links being broken for seconds at a time. Queries will be directed to any of the nodes for 30 seconds. All answers the nodes give as a network must be correct – you are disqualified instantly should you return an inconsistent answer. You gain points for each correct response, and lose points for every byte of network traffic. Oh, and unlike in the other levels, the sample code they provided doesn’t pass the test harness – it gets disqualified for inconsistent output.
So. This level was about distributed consensus – how to get multiple nodes to agree on the order of operations in the face of communication problems. I’m just thankful we didn’t also have to contend with malicious nodes joining or modifying the traffic – if you could get traffic through, it arrived unmodified.
The starter help text contained pointers to a distributed consensus protocol called Raft. Vastly simplifying the intricacies: nodes elect a leader. Only the leader can make writes to the log (in this case an SQLite database). The leader will only commit a log entry once a majority of nodes have confirmed that they have written it to their own logs. If the leader goes missing, the remaining nodes elect a new leader.
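The property doing most of the work is that majority-commit rule. As a toy sketch (nothing to do with go-raft’s actual API – the RPC stub here is hypothetical):

```python
def leader_commit(entry, followers):
    """Toy illustration of Raft's commit rule: an entry is only committed
    once a majority of the cluster (leader included) has persisted it."""
    acks = 1  # the leader has already appended the entry to its own log
    for follower in followers:
        try:
            if follower.append_entries(entry):   # hypothetical RPC stub
                acks += 1
        except TimeoutError:
            pass   # a flaky link just means no ack this round
    cluster_size = len(followers) + 1
    return acks * 2 > cluster_size   # strict majority required before commit
```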
There’s a library already written for Go, go-raft. This seemed like a sure-fire winner – just drop in Raft, right? Although dropping the library in was very easy, it wasn’t that simple. Raft is a very chatty protocol, requiring heartbeat signals, leader elections and, in our case, request forwarding to the leader, as followers do not have the authority to commit to the log.
Beyond that though, the go-raft library had issues. It didn’t work out of the box with the Unix sockets that the test harness required (although Stripe had had a commit merged into go-raft’s master branch that made fixing that extremely simple). It could fail to elect a leader. It also had a bug that seemed to bite a lot of people in IRC – I only saw it once, and I’m still not sure exactly what the cause is; I suspect a missing/misplaced lock() that caused a situation with the log that is fatal for the Raft consensus algorithm.
After battling with Unix sockets and getting an excellent passing score locally – at one point I got 150 points normalised, whilst you only needed 50 to pass – I pushed to remote. And it fell over horrendously. I ended up with a negative point score before normalisation. Needless to say that was demoralising. It turns out that reading the original Raft paper, understanding it theoretically, and getting it to work with some easier test cases is very different to getting it to work in a much more hostile set of conditions.
My problems on this level were compounded by the infrastructure regularly falling over and needing the Stripe guys to give the servers a kick or 10.
But beyond that, I feel there’s something I failed to grok. When my connections could get through, it worked fine – the SQL was always consistent, leaders were always elected, requests were forwarded properly (barring one case that I have since read about, where the request is forwarded and executed successfully but the response is lost to jitter). And yet when running on remote I either suffered from end-of-file errors (i.e. socket closed) or requests timing out. Although I eventually managed to reproduce those issues locally by manually downloading the test case, it didn’t help me diagnose the problem – I regularly had a case where one node, over an entire 30 second test run, never managed to join the consensus (which takes a grand total of one successful request to do). And I didn’t know what to do.
I think the most valuable thing this level taught me, beyond the theory of distributed systems, is how bad I am at fixing problems when there are no errors directly caused by my code. As far as I could tell everything was fine – if I ran it manually without the test harness in the middle it all worked, but when I put the test harness in, it all fell over. My own logic tells me that the problem must therefore be with the test harness. Except I know people managed to pass the level with go-raft. I need to go and look at some of the solutions people have posted to see how they coped.
At the end of the day, however fun this was overall, the last level left a bad taste in my mouth – the infrastructure problems were pretty endemic, especially nearer the end, and the difference between local and remote in the last level was absolutely disheartening. I can accept some difference, but a score that locally is three times the threshold (after normalisation) shouldn’t turn into negative points on remote. I just wanted the T-Shirt!

Choosing a new phone

In the vein of a previous post exploring why I chose to move my email over to Office 365, I shall today be exploring how I chose my new phone.
Or, more specifically, the OS of the phone (given that the hardware doesn’t interest me much – one black fondleslab is much like another black fondleslab).
As that previous post indicated, I currently have a BlackBerry. Not one of the new BBOS10 ones, but an older one (although it was new when I took my contract out).
The phone market today is radically different from the one where I first switched to BlackBerry (4 going on 5 years ago). BlackBerry has essentially died a death (in the consumer market anyway – we’ll see if their refocus back on enterprise, and the opening of BBM to other phone OSs, makes a difference). Android has risen to become the dominant phone OS – although the device manufacturers haven’t quite got the hang of OTA updates and multi-year support (I’ll get to the issue of so-called bloatware in a minute). The iPhone and its iOS have seen a more sedate rise, but have figured out OTA updates that cut the carrier out of the picture. Windows Phone has also emerged as a serious contender.
Between them, these 4 OSs have the overwhelming majority of the market – few people could name any other phone OS that is still going today. This post will take each in turn and weigh up the advantages and disadvantages for me, according to my needs and desires. I make no claims that my answer is the one true answer, or that my disadvantages won’t be someone else’s advantages. (Although I am right, and everyone else is wrong.)
BlackBerry
There’s no denying that BlackBerry has had a rocky road recently. Their latest OS 10 is a major shift from their previous direction – a major UI overhaul, coupled with keeping their excellent security features, should stand them in good stead in this battle. But alas, I don’t want another BlackBerry – their troubles don’t speak well for them being around much longer, or at the very least suggest consumers will not have the focus they once did. BBM is something I rarely use, and even if I did there’s no longer any need for a BlackBerry itself. Even email, the killer feature it handled exceedingly well, is no longer a differentiator – the competition has caught up, and BlackBerry hasn’t advanced. Their attempt to boost their app store by making their OS ‘compatible’ with Android apps speaks to me of desperation – a last gasp, as it were. Perhaps it will be enough, perhaps not. But I don’t want to take the risk that I’ll be left with an unsupported brick a couple of years down the line (phones to me are at least a two year investment, if not more).
Windows Phone 8
A relatively recent contender, Windows Phone 8 is Microsoft’s latest attempt to break into the mobile market – a successor to the previous Windows Phone 7 and the Windows Mobile OS family. It inherits a lot of its look from Windows 8 and its Metro UI, and this certainly makes it the most distinctive of the OSs out there. Yet it hasn’t been a massive success, although it is showing steady growth. Perhaps it came too late to the market, or perhaps it hasn’t been marketed well – a common feature of Microsoft’s mobile attempts. One thing is certain though – app developers haven’t gone crazy for it. I only use a core set of apps on my phone regularly (mostly social media), but I do like to try out apps, and part of me wonders if that urge is due in large part to the fact that BlackBerry’s app selection is abysmal.
iOS 7 on iPhone
I already have many Apple devices. I use a MacBook Pro at home, I have an iPod Touch which is my media centre, and I have an iPad which sees infrequent use. I have a large collection of apps on my iPod, although again there’s only a core set that I actually use. So surely an iPhone is a natural next step? Well, maybe not. iPhones are expensive (I know, that’s hardware – but unlike the other OSs, device and OS are tied together here). I already have an iPod Touch for all my Appley needs. I know of no-one who uses iMessage or FaceTime, so those have no appeal. My apps are already on my iPod Touch, and I don’t hate its wifi-only nature. There’s also Apple’s iCloud, which is very much a walled garden as far as syncing services go. I use it as minimally as I can for my needs right now (mostly to save connecting via cable to transfer photos).
Android
Oh Android. Google’s attempt at a mobile OS. Phenomenally successful. Open source, except for when it’s not. Android. It came onto the scene with what was, at the time, a terrible UI, although that has improved dramatically with recent revisions. But then, with Android, the UI is kind of moot. It’s open source (except when it isn’t), and people have written entirely separate launchers and themes – see many of the carrier/manufacturer branded versions for examples. In fact, this really makes it very hard to talk about Android in any meaningful detail. Google’s Android is very different from the open source Android – the keyboard with the cool swipey-pathy-typey thing? Closed source. Google’s Mail app? Closed source. It’s well documented that Google has been closing down Android slowly but surely. And although you have the possibility of side-loading apps, very, very few are actually distributed like this – they almost all go through the Google Play Store. It seems that open source is a flag Google waves for community support, to blind the community to just how hard it would actually be to create a successful Android fork – look at what Amazon has to go through to clone the APIs provided by the closed source Google Play Services. CyanogenMod also has to dance around the redistribution of the closed APIs that many apps assume are present, by backing up the original Google apps and then reloading them after their ROM is flashed. And how meaningful waving the open source flag is when the core platform APIs of the project are developed in private is… yeah.
I make no secret of the fact that I don’t trust Google these days. You are an advertising target to them; everything they do that is intended for consumers will eventually feed back into their advertising algorithms. Which is why it may surprise you that I went with Android as my next phone OS. I’m not sure yet how I’ll remove or limit Google’s tendrils on the device. Running stock AOSP? Possibly, if I can get my social media apps to work without Google Play Services. Using a separate account for Play Store things? Possibly. I’ll most certainly be limiting app permissions as much as possible – I was surprised to learn that Android only recently got the ability to limit GPS access on a per-app basis, when iOS has had Location Services controls for ages. Perhaps I’ll put CyanogenMod on it, although frustratingly I can’t find a full description on their site of what changes they actually make to AOSP. I’ll certainly disable Google Now, and its always-listening “Ok Google”. I’d better buckle up, because this is going to be an interesting ride. Especially as I find apps I just want to try, if only for 5 minutes.

Safe Spaces

Quick note before the post proper: I do have an analysis of The World Inside in the works, but it’s proving rather troublesome to tame into coherence. For now, this post.

I regularly see people declaring that somewhere or other is a “safe space”. So let’s pick this concept apart, and see just how easy or hard it is to create a safe space.
What is a “Safe Space”? The idea behind it is simple – a place where someone (usually of a minority group, though not necessarily) can write, talk and discuss their beliefs without mockery, without trolls, and without a risk of being offended (or in some cases distressed – known as being triggered) by content within the area. For example, a safe space for a homosexual person would be a space where they aren’t condemned for being homosexual and won’t be mocked with homophobic slurs, and where they can talk frankly about their experience. A transgender person, meanwhile, would have a space where they won’t be called any number of transphobic slurs, nor be confronted with such slurs unexpectedly.
Sometimes you see people claiming that a particular tumblr tag is a “safe space” and that people should keep their hate out of the “transgender” tag, for example. This is a futile request. The nature of tagging (on most sites, including tumblr) is that tags are public and unmoderated (beyond generic site-level moderation). Such tags will naturally be used by anyone who wishes to. And whilst sites may have “community guidelines” and so forth against homophobic material, such policies tend to rely on user reporting, and notably tend not to be as strict in their moderation as safe spaces require.
Another angle is, for example, the /r/lgbt subreddit, which claims to be a safe space for any and all gender, sexual and romantic minorities. And it does work. Kind of. Reddit provides subreddit moderators (subreddits are essentially forum boards) with tools to remove any posts they wish, and this subreddit in particular has very proactive moderators ensuring that any (even slightly) anti-lgbt material is removed quickly. So they have a safe space. Great. Except, as is common with highly active moderators, anything that doesn’t fit with their world-view is also removed. As such it creates a community that is perceived to be ‘all on the same page’. Even posts that aren’t anti-lgbt, but question, for example, the ever-expanding alphabet soup, are removed.
Moving into the real world, a “safe space” tends to be a meeting area where there are people in authority with the power to remove people from the space – such as University LGBT societies. These tend to be less prone to the heavy-handedness of internet community moderation, by virtue of the fact that without the online disinhibition effect (something I learned a great deal about for a piece of University coursework) the number of trolls and “extreme” views tends to be minimised.
That said, online “safe spaces” are needed – providing people who experience homophobia, transphobia, or even things such as sexual assault or attempted suicide, with an area where they can pseudonymously communicate with others in the same boat is vital. It encourages the community to connect, to network, and thus to become stronger. And it insulates them from the problems they face elsewhere in life (sometimes frighteningly regularly).
Safe spaces need to be actively moderated, otherwise they are impossible to maintain. But it is important to recognise that this moderation can go too far, which can cause a narrowing world-view and even rejection, rather than acceptance, from wider society (or even from within the same minority group – see the split from /r/lgbt to /r/ainbow).

On Pornography

Welp. Cameron’s done it. Bent over backwards to introduce unworkable, unrelated policies in a confused mess designed to comfort traditional Tory, middle-class, Daily Mail-reading idiots… I mean, voters.
So let’s look at the proposals he has outlined.

  1. A ‘crackdown’ on those accessing child pornography/ child abuse images.
  2. Internet Filters that will by default block access to all pornography on those using residential ISPs.
  3. The criminalisation of simulated rape pornography.

The crackdown.
I don’t think many people would disagree that child sexual abuse is absolutely disgusting. My mum was a Special Educational Needs teacher, and she has worked with children who have been abused; the damage it can do to them is appalling. With that out of the way, let’s have a look at this.
The way this is currently handled is that you have CEOP, a branch of the police, who track down the people committing the abuse, rescue children, and find people who are viewing the content. You also have the IWF, an independent charity, who handle reports of child abuse images submitted by the public. They create the blacklist of URLs that is passed to search engines and ISPs to block access to, and filter out, those pages containing the content. They also forward information to CEOP and equivalent agencies worldwide after deeming content to be potentially illegal.
The proposals include getting search engines to redirect results, so someone searching for “child sex” for example, might get results for “child sex education”. There will also be pages displayed when someone tries to access a page blocked under this scheme that will warn them that looking for such material is a criminal offence. I imagine it would look similar to the ICE notice placed on seized domains by the US Government.
The thing here, though, is that Google (and most other search engines) already remove results pointing to child abuse imagery. My thoughts on the IWF being the determiners of what gets blocked (which they already are) are long enough for another blog post – but suffice it to say, I’m not sure that an independent, unaccountable charity should have “special permission” to view and classify the images without any form of oversight – especially as it’s generally hard to work out that something has been blocked at all (see the Wikipedia blocking fiasco). I have another point about the effectiveness of blocking content – but that will be the main thrust of the next section.
 
Blocking of Pornography
So, the second issue is the implementation of filters on residential UK broadband connections that will prohibit access to porn, should the account holder not opt out of the blocks. This is a further example of how our internet use is getting more and more restricted over time. First they had CleanFeed, which blocked the IWF’s list. Then they blocked The Pirate Bay and other similar sites. Now they want to block pornography (albeit on an opt-out basis for the moment).
So, firstly, what is pornography? Images of oral, anal or vaginal sex? How about “kink” images of bondage, where no genitalia are visible? Pictures of female breasts? Cameron has already announced that Page 3 won’t be blocked.
How about the written word? Many fan-fiction pieces get very steamy, not to mention the entire erotica bookcase at your local bookshop (or Sainsbury’s).
Of course, our mobile internet connections are already filtered by default – so we can look at those to see what will be blocked. “User-generated content sites”. Oh yes, I suppose they could contain pornography; Reddit in fact has many subreddits dedicated to such things. ISPs have even indicated that categories such as “anorexia”, “web forums” and even “esoteric content” may be blocked. One natural side effect of that will be the (accidental) blocking of sexual education resources. No filter is 100% perfect, so it’s inevitable that innocent sites will get blocked. We can look at what mobile operators have blocked “by mistake” in the past – a church website blocked as adult, a political opinion blog(!), and even eHow – a site that posts tutorials and educates on how to do everyday things.
This is to say nothing of the LGBT websites that might be blocked – vital resources for any person questioning their gender or sexuality – but especially for young people who may not feel comfortable talking with their parents about these things. This by itself will actively cause harm (if these proposals didn’t cause harm I wouldn’t be so strongly against them), but there is further harm to come from these – parental complacency.
There are bad parents. There are parents who don’t communicate with their children. We all know they exist. And any right-minded parent would fear their children seeing something on the internet that they weren’t ready to see. But these filters will make parents think their kids are “safe” – that they don’t need to talk with their kids about sex, or about things they might see on the internet, and that they don’t need to use the internet with their children. So when children do stumble across adult content, they’ll be even less prepared to talk about it. And these filters suppose one thing – that the children are less tech-savvy than those writing the filters. Anyone who has worked with children, or works in computer software, will know how fast kids adapt to new technology. Those older children who do want to seek out this material aren’t stupid. They’ll know how to get around these filters – unless you want to block searches for proxies (or VPNs for the more technically inclined) too. And all the time the parents will think their kids are safe, wrapped securely in cotton wool. This is possibly one of the most damaging effects.
Simulated Rape Pornography
The final measure announced in this slate of news was the criminalisation of simulated rape pornography – aiming to close a loophole in Section 63 of the Criminal Justice and Immigration Act, affectionately known as the “Extreme Porn Law”. To be clear, this proposal is talking about banning consensual, fictional “rape-play” images. For context, studies from the late 70s and 80s have shown that the idea of forced sex is one of the most common fantasies. Somewhat amusingly, this announcement came shortly after the Crown Prosecution Service had adjusted the prosecution guidelines for offences under this act.
To try and criminalise images of consensual, legal acts is utter madness. My objections to this are very much the same as my objections to the original section of the act. It assumes that we are unable to distinguish between fantasy and reality. It assumes that there is evidence of harm from looking at consensual images. We’re happy to let people run around and kill simulated people, but to watch a consensual act is somehow damaging. To me this stems from our culture’s attitude towards sex in general – that it’s something to be done behind closed doors, without disturbing the neighbours, and without discussing it afterwards. For something so natural, that’s a very weird attitude. It is, incidentally, the same reason I believe the pornography-blocking proposals will cause harm.
Summary
Overall, these proposals are terrible. They won’t work, they’ll cause actual harm, and they’ll make people with common fantasies feel victimised.
You can sign the OpenRightsGroup petition here, and a DirectGov ePetition here – although neither addresses the criminalisation of simulated rape.

Tor, Freedom Hosting, TorMail, Firefox and recent events

So, there’s been… a lot of panic in the Tor community over the last 24 hours. Let’s have a look at some facts, shall we?
Firstly, it would be good if you knew some basics of Tor – I have a previous article on it here. Secondly, forgive the number of Reddit comments I’ve linked to – given the lack of mass media coverage of this news, there’s not much choice.
News broke that the FBI had issued an arrest warrant and extradition request to Ireland for Eric Marques. The article frames him as a large distributor of Child Abuse Images. Whether that is accurate or not remains to be seen in court, but one thing that is (now) known is that he was the man behind “Freedom Hosting”, which provided hosting for Tor hidden sites. A number of those sites apparently hosted Child Abuse Images or videos. It’s not yet known if he had any connection with any of those sites beyond being their hosting provider.
One immediate question that presents itself is: how did they find out that this guy was operating Freedom Hosting? I haven’t seen any evidence on how this happened. It’s possible that they used a server exploit to find out the machine’s real IP address, or that they tracked him down via other means (financial records etc.) and then happened to find out he was behind it. Incidentally, the only evidence the Tor community has that he ran it is the timing of all these events.
So, all the sites hosted by Freedom Hosting disappeared from the Tor network. Then, a few days later, they showed up again. But this time some (but not necessarily all) of the hosted sites included an extra iframe that ran some JavaScript code (the link is to a pastebin, so it is safe to click). Needless to say this JavaScript is an attempt to break anonymity.
Now, a small amount of background. Tor (for end users) is mostly run through the Tor Browser Bundle these days. This combines Tor with a patched version of Firefox – to fix some anonymity leaks – as well as some Firefox extensions such as HTTPS Everywhere and NoScript. NoScript is a Firefox extension that prevents JavaScript from running according to the user’s preferences (block all, whitelist domains, blacklist domains, block none). Great, so the JavaScript wouldn’t run? Well… no. The Tor Browser Bundle ships with NoScript in the “run all scripts” mode. Tor have had an FAQ entry about this setting up for a while. The short answer is that because Tor tries to make everyone look like the same, normal machine – always reporting a Windows NT kernel (even on other OSs), for example – disabling JS would leave you in a minority, as well as making it harder to actually use the normal, JavaScript-reliant internet. Needless to say, Tor are re-evaluating this tradeoff. This is especially true as their patches to Firefox should, in theory, make it harder for JavaScript to break out and find the user’s real IP.
So, this script can run. What does it do? Well, it specifically targets Firefox 17 on Windows. Firefox 17 is the Extended Support Release of Firefox, which is what the Tor Browser Bundle is based on. Claims that this is a 0-day attack abounded, but further examination revealed that it had in fact already been patched in Firefox 17.0.7 – which had been packaged into a Tor Browser Bundle at the end of June/early July. Put together, this means the script only affects users of old Tor Browser Bundles on Windows. The script appears to use the vulnerability above to try and send your real IP to another server. It also tries to set a cookie, presumably to track you as you browse the internet and onion land.
Notably TorMail (a service which provides public email facilities over Tor) was also apparently hosted on Freedom Hosting, so far more than just people accessing Child Abuse Images are potentially affected – anyone who wanted a truly anonymous email account is too. This makes it likely (although not guaranteed) that the FBI now have access to every e-mail stored on that server.
Freedom Hosting, whilst not the only Tor hosting service, was certainly one of the largest and best known. And TorMail was unique in its service. What this will mean for whistleblowers and others who used TorMail remains to be seen.

eBooks, Apple, Amazon and pricing

Given the recent judgement against Apple in US courts for eBook price fixing, I figured my views would make a decent topic for another post here.
Firstly, some history.
When eBooks first started becoming mainstream (before iBooks was launched), they were sold using the traditional “wholesale” pricing model. This model is the same as the one used to sell physical books everywhere. The publisher has a wholesale price they sell to retailers at, and retailers are then free to determine their own pricing on their shelves – creating the situation where one retailer may have a book at $12.99 whilst another has it at $11.99, with the wholesale cost being less than either (bookstores have storage, customer service and other overheads). It’s common for retailers to occasionally sell books as “loss leaders” – mostly popular new releases – where the retailer chooses to sell the book below the wholesale price (i.e. for less than they paid) to encourage more people to visit the shop and spend more (due to feeling that they got a “good deal”).
The benefit of this pricing model is obvious – in theory market forces will lower the prices for the consumer by ensuring that there is competition between retailers, and new retailers can enter the ring to try and compete.
When Amazon first launched the Kindle, eBooks were sold with the wholesale model. However, Amazon sold every eBook as a “loss leader” in an attempt to sell more Kindles – selling for $9.99 what they had bought for $13. Due to a combination of factors – Amazon’s (at the time) huge eBook market share, over 90% according to the WSJ; publishers’ insistence on DRM causing consumer lock-in; the possibility of Amazon’s pricing coming to be seen as the ‘right’ price (and thus a loss for publishers); and the general tension between publishers and Amazon – publishers wanted to raise eBook prices quickly. But with no major competitor their negotiating position was poor: at the time, if they didn’t put eBooks on Amazon, their eBook sales would be decimated.
Enter Apple. Apple based its iBooks pricing on the model used in the iTunes Store – so-called ‘agency pricing’. In this model the publisher decides the retail price, and the book has to be sold for that price – the retailer simply gets a percentage cut (in Apple’s case, 30%). Suddenly the publishers could work with Apple to break Amazon’s stranglehold on the eBook market. Apple included a clause in the iBook Store contract stating that eBooks must not be sold for less elsewhere – i.e. if a book was sold cheaper elsewhere then that price had to be used in the iBook Store as well. With these contracts in place the publishers suddenly had a much stronger position from which to negotiate with Amazon.
For a short while Amazon held out – causing the infamous situation of an entire publisher’s catalogue becoming unavailable overnight. Eventually Amazon gave way and allowed publishers to use the agency pricing model on Kindle eBooks. eBooks on Amazon now cost more than they had under the old “wholesale” pricing, due to collaboration between the major publishers and Apple. The seeds of the price fixing charge had been sown.
A couple of notes before moving on to my own opinions on the case. Some publishers are now experimenting with DRM-free eBooks. The proliferation of alternative devices has meant that ePub is now the dominant eBook format – in all cases except the Kindle, which still uses the MobiPocket standard. Publishers also claim that physical production and transportation of a book is only a tiny fraction of a book’s cost.
—-
Right, my own thoughts on this. It’s clear that Apple and the publishers did break the law. They collaborated as a cartel to raise consumer prices. One of the few kinds of behaviour that is ‘per se illegal’ (illegal in and of itself) is horizontal price fixing – fixing the price across an entire market. There is no defence in law.
That said, morally I’m not sure it’s wrong. The market back when the agency model was introduced was heavily skewed in Amazon’s favour (and the publishers’ position was incredibly weak). They were a monopoly, and selling all eBooks as loss leaders meant it would be very hard for another store to break into the market unless it could offer something above Amazon’s offering (which, being cloud-based, is very seamless). If Apple hadn’t managed to break Amazon’s pricing structure it’s possible that their own eBook store wouldn’t have been anywhere near as successful – especially given there is a Kindle app for iOS. Yes, consumers ended up paying more for eBooks – but for the convenience of having them anywhere, and having hundreds in my pocket, I’d be willing to pay a premium. That premium was, in my opinion, needed as a means of breaking up the market. I’d also argue that Amazon was abusing its dominant position by selling everything as a blanket loss leader.
It should be said, however, that now Apple has been forbidden from using the agency model, I have no idea what is going to happen to eBook prices. The public has become used to the near-retail prices of eBooks, and I doubt that a switch back to wholesale would see any decrease in price from Apple.
Finally, as a disclaimer: I don’t like Amazon as a bookstore. I say this whilst having Amazon Prime, and having ordered books from them recently, and many over the years. Their attempts to become fully vertically integrated by becoming a direct publisher, their questionable practice of remotely wiping books from Kindles, and their sheer advantage in economies of scale all disturb me. When possible I buy from brick and mortar stores, but I must hold my hands up and say that I do still order from Amazon – particularly if I don’t have plans to go to the nearest bookshop within 24 hours. I’m very weak-willed when it comes to book purchasing. It’s entirely plausible that this has tainted my opinions, although I hope they still make logical sense regardless.

Thoughts on UltraViolet

I’ve recently been able to take advantage of UltraViolet ‘digital copies’ of movies, and the experience has left me with some thoughts. I promise that this won’t turn into an Anti-DRM rant, beyond a brief mention of the rights you are guaranteed.
Firstly, for those who don’t know what UltraViolet is, a brief explanation. UltraViolet is essentially a service, backed by 5 of the ‘Big 6’ movie studios, that provides a ‘digital locker’ of sorts for the digital copies included with select DVD/BluRay releases. You buy the DVD or BluRay in the shop, and included in the case is an UltraViolet redemption code. After following the instructions you then have a digital copy of the movie in your UltraViolet account that you can stream or download onto your UV-compatible players. There’s also the ability to add up to five additional accounts under your own to share UV library access.
At least, that’s the ‘non-technical’ version. What the service really is, is a collection of licence files. That’s it. The UltraViolet service itself doesn’t store any movie files; it just stores some DRM information stating that you have the licence for a film. The actual files, streaming and downloading services are provided by other ‘UltraViolet retailers’ – so, for instance, a Warner Brothers movie will likely use the Flixster service, both being owned by the same parent company. The ‘stream’ and ‘download’ links on the UltraViolet website merely point you to this other service – which is an entirely separate account. UV mandates a choice of DRM schemes, and mandates a few select rights (streaming for free for one year, unless geographic restrictions come into play); all other options are couched in “may” language, and thus might not be available at any given point.
Because of this… decentralised… nature, the core of UltraViolet is essentially nothing more than a thin veneer on top of numerous different digital lockers, and thus is pretty useless. There may be progress in the future – UltraViolet appears to have plans for ‘branded’ UltraViolet players that let you stream (or play downloaded files) from any of the UV retailers. That unification could actually provide some value.
Now, let’s talk about this in practise.
My first experience was when I bought Les Misérables on DVD. For this film they had also allowed an additional iTunes redemption (separate from the UV redemption – as UV is incompatible with iTunes[0]). The iTunes redemption was simple – sign in/up to the iTunes Store, put the code in the redemption box, and done. The UV experience first required me to sign up for a Flixster account, then allow that to create a separate UltraViolet account, then put the code in the box, after which the film showed up in my UV library and in my library on the Flixster site. As I only bought the DVD version, I was not surprised to only receive a standard definition digital copy. This, however, is where the cracks start to show. On the UltraViolet site the ‘download’ box is greyed out – implying that I only have streaming rights. The ‘watch’ link on the UV site simply takes me to my Flixster library; it doesn’t start streaming the movie – instead I need to hunt through my Flixster library to find the actual film. On the Flixster site, however, it transpires that I can download the movie – into the Flixster desktop app. It’s this inconsistency of information, and the lack of one-click integration, that is to me UltraViolet’s biggest weakness.
My second experience was redeeming my ‘Harry Potter Wizard’s Collection’. This is essentially a giant box containing all 8 movies on DVD, BluRay, and a UV copy of all the movies, along with many special features and exclusives (no iTunes digital copy code, however). The UV retailer was again Flixster, although this time I only needed to sign into Flixster, as they had ‘linked’ my UV and Flixster accounts together; thus my previous notes about this particular retailer carry through. This time, however, the UV site shows that I have the standard and high definition versions of all the films. Yet when I go to the Flixster site, I can find no way to actually obtain, through stream or download, a high definition version of any of them. This is a major problem in my view – and again highlights how the lack of unification and integration is all too apparent. When they’re trying to compete with iTunes for digital distribution, unification is one thing they need to get spot on. To be told in one place that you have the high definition version, but not be able to get it, is frustrating.
If UltraViolet feels like one thing, it really does feel like it was designed by a committee, none of whom wanted to let go of their vested interests, and each of whom had a different idea about what should be allowed – leading to this incredibly weak platform. If their UV player idea ever takes off, then it would help. But fundamentally, the idea of a system that does nothing but say “John Smith acquired this movie on this date from this UV retailer” is worthless on the technical side. To the end user it’s a glorified catalogue that doesn’t even provide one-click access.
A final note – obviously I haven’t tried any of the other UltraViolet retailers, but I don’t have high hopes of them being any better either.
[0] – UltraViolet movies cannot be played in iTunes, nor loaded natively onto iOS devices. However, UV retailers (such as Flixster) do have iOS and other native platform apps, which have to separately download/stream the movie each time.

Why I run a Tor Relay.

Everyone has the right to freedom
of opinion and expression; this right
includes freedom to hold opinions
without interference and to seek,
receive and impart information and
ideas through any media and
regardless of frontiers.

– Article 19, The Universal Declaration of Human Rights

 
I make no secret of the fact that I use Tor, and that I run a Tor relay. Admittedly I don’t use Tor as often as I perhaps should – a lot of my browsing is on websites where I have an account that’s already associated with ‘real-world’ me, which negates the purpose of Tor. Given the recent headlines, I figured now would be a good time to explain what Tor is, and why I encourage its use.

What is Tor?

Tor is software developed by the Tor Project that aims to ensure that your ISP and other middle-men cannot correlate who you are with what you’re doing on the internet. The idea is that you download the Tor Browser Bundle – a one-click package of everything you need. Then, when you browse the internet using the provided browser (a modified Firefox), you are routed through 3 volunteer relays, with the traffic encrypted in layers, before the last node actually sends your request to the website in question. The way the encryption is set up means that your ISP, and anyone listening before your data reaches the first Tor node, knows only that you’re running Tor, not what you’re doing. The third ‘exit’ node, and anyone listening to the connection between it and the destination website, can only see your data, not its original source – as far as the website is concerned your IP is that of the exit node. When you combine this with HTTPS connections, even the exit node can only see the site you’re visiting, without seeing any of the data being passed back and forth. The EFF has a nice diagram summarising this.
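If you want a toy mental model of the layering – and this is absolutely not the real Tor protocol, just nested symmetric encryption using Python’s cryptography package – it looks something like this:

```python
from cryptography.fernet import Fernet

# One key per relay in the circuit: entry, middle, exit.
keys = [Fernet.generate_key() for _ in range(3)]

def wrap(message, keys):
    # The client encrypts for the exit first, then the middle, then the entry,
    # so each relay can only peel off its own layer.
    for key in reversed(keys):
        message = Fernet(key).encrypt(message)
    return message

def relay_peels(message, keys):
    # Each relay in turn removes one layer; only the exit sees the payload,
    # and only the entry ever sees the client's address.
    for key in keys:
        message = Fernet(key).decrypt(message)
    return message

onion = wrap(b"GET / HTTP/1.1", keys)
assert relay_peels(onion, keys) == b"GET / HTTP/1.1"
```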
Thanks to this design, Tor provides a way to covertly access websites that you may not otherwise be able to reach. In Turkey, for example, it can bypass the national web firewall so users can read news about what is really happening. Tor’s usage skyrocketed during the height of the recent Egyptian revolution.
A second feature of Tor is its ‘hidden sites’ – websites that accept connections directly over Tor, without the traffic ever going back out over the ‘clear net’. This way there is no ‘exit’ node to spy on your data or the site you’re visiting; your connection is fully encrypted. There are Wikileaks mirrors for those wishing to view that data. There’s an e-mail service so you can send and receive emails (even interacting with clear-web email addresses) entirely anonymously. The New Yorker has an anonymous document submission/communication platform as a hidden service (the second link won’t work unless you are using Tor).

What’s a Relay?

A Tor relay is a computer with the Tor software installed that has volunteered to be one of the middle-men in other people’s Tor connections. Relays provide the bandwidth that makes the Tor experience faster and more stable. An ‘exit node’ is a special kind of relay that has additionally volunteered to be the last Tor node before a connection jumps out onto the clear net. By running a normal relay I can help ensure that whistleblowers and dissidents can access the information they need to do their job.
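For the curious, turning a machine into a (non-exit) relay is just a handful of lines in the torrc config file – roughly the below, though the nickname and rates are whatever you choose, and you should check the current Tor manual rather than trusting my memory of the directives:

```
# Illustrative non-exit relay configuration
Nickname MyRelayNickname
ORPort 9001
ContactInfo you@example.com
# Bandwidth offered to the network
RelayBandwidthRate 500 KBytes
RelayBandwidthBurst 1024 KBytes
# Never act as an exit node
ExitPolicy reject *:*
```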

Any downsides?

Like anything, it’s not perfect. There are known attack models – if the relays you route through (crucially, the first and the last) are run, or have their traffic logged, by the same organisation, it could in theory perform timing analysis to work out which data stream is yours. There’s also the fact that it is significantly slower – streaming data is out of the question (even more so because Tor only carries TCP and doesn’t innately support UDP). And naturally, like all privacy-preserving tools, it can be used by the bad guys as well – Tor semi-regularly hits the headlines because of the Silk Road, a drug marketplace that operates as a Tor hidden service. But humanity is mostly good. Some people can do terrible things, but in the grand scheme of things I believe that the good it enables, and that humanity uses it for, greatly outweighs the possible negatives.

So who the heck is funding this?

One last point. The Tor Project makes no secret of the fact that it was originally founded by, and receives funding from, the US Navy. But as you can see, there is now a wide diversity of funding coming in. And if that isn’t enough, the Tor Project’s software is entirely open source (including the core Tor code) – you can download the code, submit patches (please do that!), and check there are no back doors.

Did I see mention of a free T-Shirt?!

Yes. Yes you did. I have my free t-shirt. Get yours by running a 500KB/s relay for two months, or a 100KB/s exit that allows port 80 (HTTP) traffic. (I do the former.)
—–
Privacy is important. Especially for those whose governments are actively stopping their own people from being well-informed.  You can make a difference. ^_^
 
Update: I finally found a link I was looking for, but couldn’t find. I’ve added it to the Hidden Sites discussion.

Migrating Email from Google Apps to Office365

So, as you might have gathered from the title, I’ve switched away from Google for my e-mail, along with calendar and contacts, so I figure a post on how I went about it is in order.
Firstly, my setup, and thus how I picked Office365 to replace Google Apps for Domains.
I have a Mac, and use Mail.app, Contacts.app and Calendar.app on there. I have an iPod Touch and an iPad, where again I use the default apps. My phone is a BlackBerry (OS 7, not 10 – this becomes important later). When using Google to house email, contacts and calendar everything worked, and I had linked my Facebook account on the BlackBerry to my Contacts application so my contacts had display pictures and additional contact information.
I only require one account, with a custom domain name. I may add further domain names in future.
The vast majority of my contacts are simply a name and email address. Others are a name and phone number. A scant few have both email and phone – unless they are linked up with Facebook.

Evaluation of Options

I wanted, as I stated before, a hosted solution. A friend recommended Intermedia to me; however, they now require a minimum of 3 accounts, which puts the initial cost of an account for me at about £18 per month. I also considered running my own contacts and calendar software, and going with a less full-featured email service such as FastMail. Software I looked at included OwnCloud and Baikal. However, none of the calendar software seemed particularly well designed – the web UIs were often lacking or non-existent.
So I looked, as I mentioned, into Microsoft’s own solution. At £5.60/month I get unlimited storage, and a full email, calendar and contacts suite. An additional bonus – an important one for me – is that it includes complimentary access to BlackBerry’s Business Cloud Services. BCS is a slightly scaled-down version of BlackBerry Enterprise Server that BlackBerry host themselves, and it integrates directly into Office365 if you choose to enable it. This is vital, as adding the Office365 account to the standard BlackBerry Internet Service only syncs the email, and not the calendar or contacts – both of which are pretty essential for my phone to achieve its purpose.

The Migration

Having selected Office 365’s Hosted Exchange plan, I signed up and paid my first £5.60. The signup was quick and smooth, as was the account creation. When you sign up you initially get to choose a subdomain underneath onmicrosoft.com for your account. You cannot remove or change this subdomain, but it’s free to add your own domains to the account and assign users to the correct domain name. Once the account was set up I set up access to the default subdomain account on my devices, and sent an email to check everything worked. This necessitated enabling and setting up my BlackBerry Cloud Services account.

Blackberry Cloud Services

This is the only negative experience I’ve had. It was the first time I got an error – when it tried to load their administration panel. It fixed itself on a refresh, but still. In addition, unlike Office365, it is not a sleek UI; it is very much a business/corporate-designed UI. It reminds me very strongly, in fact, of this comic image. Every navigation element leads to a giant search form with many options. Naturally, in my use case, with only one account and one BlackBerry, I’d prefer just to see my user details straight away. That said, my use case is obviously not the primary target for this tool.
Once I’d searched for my user and selected myself from the results list, I checked through the options given for configuring a user’s profile. On a BlackBerry, this profile can control every little thing that the device and user can do. It allows specification of device passwords, remote wiping, and separation of work and personal apps (to the extent of disabling copy-paste between the two domains). Naturally, this web tool only provides a fraction of that power, and the default options were fine in my case. So I used the option to send an email with an activation password to my mailbox. On the BlackBerry I had to download the ‘Enterprise Activation’ app and use the password provided to associate my BlackBerry with the account.
All the testing emails worked fine, so I set about doing the actual migration.

Doing the Migration

The first thing was adding my domain to the Office365 account. This was incredibly simple, with helpful instructions on updating the DNS to verify ownership and to enable specific services, for a variety of domain name service providers. Having verified my domain and activated it for my Exchange account, I reassigned my user to use my own domain instead of the onmicrosoft.com subdomain. That done, I removed Google’s records from my DNS. I then set up the email migration. Again, Office365 had an excellent wizard that catered for a variety of Exchange-to-Exchange migration situations, and a generic IMAP importer for other situations. I went with the IMAP route, naturally. After providing the domain and IMAP server, I uploaded a CSV containing the details of the accounts to migrate (one, in my case), and set the import going. With 27,000-odd messages it took a few hours for my account to fill up – so many messages because of the way GMail handles its tags over IMAP: each tag is its own folder, and messages are duplicated across the ‘folders’.
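If I remember the format correctly, that CSV was nothing more than a header row and one line per mailbox with its IMAP credentials – something like the below (the column names are from memory, so check Microsoft’s own documentation; the values here are obviously placeholders):

```
EmailAddress,UserName,Password
me@example.com,me@example.com,my-imap-password
```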
I then used my own Google Account to export my Contacts into a .csv file which I then imported directly into my own account, again without any issue.
The Calendar web app, however, as far as I can see has no built-in import function. Trying to use Calendar.app to export my Google calendar and re-import it into my Exchange calendar with its own import function caused an error. As I don’t actually have any future appointments scheduled at the moment this is bearable – my history is currently only viewable on my Mac, as I imported it into a local calendar. For those with future appointments, though, that could well be a stumbling block, or possibly a deal breaker. You may be able to use Google Calendar to ‘share’ the calendar with the Exchange one, but I didn’t investigate that option as the loss was minor for me.
Having imported everything, I updated all my sync settings to point to the new domain, and everything was pulled down fine. I ended up with duplicate contacts in some instances, but merging them when they showed up on a device was sufficient. On my BlackBerry I had to remove the sync accounts entirely, as it was still trying to sync with my old account as well; after wiping out all the contacts locally and re-enabling sync to my new account, everything seems to work.
I went to re-enable Facebook-BlackBerry integration, only to discover that this option is disabled by BlackBerry Business Cloud Services. It’s configurable in the full BES package, but it is one of the things they removed in this not-quite-BES-in-the-cloud version. For now I have simply enabled Facebook integration on my Mac and iPod/iPad. The information added by these isn’t synced up into Office365, so my phone contacts aren’t Facebook-integrated at the moment. Microsoft do, however, have US-only Facebook integration directly in Office 365 – hopefully that will arrive here soon. Either way, Facebook contact integration on the phone is only a nice-to-have, and didn’t really benefit me aside from the display pictures.
I did, however, enable LinkedIn integration in Office365, which pulled additional contacts into my contact list.

Summary

Everything went better than expected. Office365’s Exchange component is fast, sleek and very nice to use. The only things that went wrong were the lack of a calendar import, and the BlackBerry, which had by far the most issues in all aspects of the migration – something that can be laid firmly at BlackBerry’s feet. That said, BBOS10 doesn’t use BES, but instead integrates with email, calendar et al. over standard protocols (hopefully reducing those issues?).

University is over.

So on Wednesday I had my last University exam.
It’s the end of my formal education (barring going for a Masters at some point in the future). Having been in formal education since the age of 4 or 5, this is quite something. Yes, I had a placement year last year, but for some reason I don’t really count that – I was still a student really, still getting e-mails from my University. But that stage of my life is at an end now.
*flumps in to a seat*
So what have I done since the end of that exam? First I went into Bath’s centre and kind of wandered in a stupor for a while, before going home and reading a book. I’ve played some games, and tidied my room after the garbage that collected during the revision-snacks phase. I watched the movie Night Watch (the subtitles on Disc 2 are excellent) – I have the last book in the series the film is a loose adaptation of on my bookshelf, waiting to be read. I rewatched Addams Family Values. I’ve also looked into alternative email/contacts/calendar software. Email is one thing I don’t want to have to host myself – dealing with blacklists, spam and security is not something I want to do just to keep my email working. I’m thinking of perhaps using Microsoft’s Hosted Exchange offering. It’s cheap, but still paid for (thus I’m the customer, and not the data). It’s Exchange, so it’s feature-rich. And being Microsoft and enterprisey, it doesn’t go through upheavals with social media integration. Sure, they might work just as much with law enforcement as Google do – but at least they don’t have my search history as well.
A final thing I’m going to do over the summer is code myself up some private journal software. Yes, I have WordPress, and I could use that, but it doesn’t quite do what I want. Given my quality skills with UI design, I’m sure that this will look all modern and that </sarcasm>. Anyway, here’s a thumbnail sketch of my requirements:

  • Web access
  • One entry per day. You can extend, but after say a week you can’t edit them anymore.
  • With specific sections for:
      • Interesting links browsed / found
      • Interesting stories read
  • A way to pull in specific Tweets/Facebook/Tumblr posts.

So the way I’m imagining it is as a kind of digitally integrated, private snapshot of my life, day by day (a rough sketch of how an entry might be modelled is below). It’ll be interesting to do this. Maybe I can use it as a chance to finally learn jQuery and AJAX and all the web 2.0 goodness? I don’t know. I’ve wanted to get around to learning jQuery and AJAX and that for a while, but I’ve never found a tutorial that actually made it click. But anyway, this might be my project over the summer. 4 months of a clear diary is somewhat daunting, although it will likely be the last time I have this amount of free time.
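To make that a little more concrete, the entry model I have in my head is roughly this (a Python sketch; all names and the exact edit window are provisional):

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

EDIT_WINDOW = timedelta(days=7)   # entries lock about a week after their day

@dataclass
class Entry:
    day: date                                               # exactly one entry per day
    body: str = ""                                           # the free-form journal text
    links: list[str] = field(default_factory=list)           # interesting links browsed / found
    stories: list[str] = field(default_factory=list)         # interesting stories read
    social_posts: list[str] = field(default_factory=list)    # pulled-in tweets / Facebook / Tumblr posts

    def editable(self, today: date) -> bool:
        # You can extend an entry for a while, then it becomes read-only.
        return today - self.day <= EDIT_WINDOW
```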
Well, this turned out to be somewhat rambly. Ooops.