"A company like Microsoft [...] would never deem manually binary editing as acceptable." is complete bullshit speculation. The guy who analysed the bug report must be good at assembly language. Maybe he developed the binary patch very quickly as a proof of his analysis. When this analysis was handed over to the development team, they had two options. The first was to spend time setting up the development environment with the correct versions of the sources and tools (error prone), redo the fix in C++, and run all the non-regression tests. The other was to do nothing more and bet that this binary will never need to evolve again. This is cheaper and the risks seem smaller. IMHO, this beautiful patch shows that Microsoft is open-minded and chooses smart options instead of bureaucratic ones. I applaud.
Came here to write the exact same thing. "The only way this happened is if Microsoft somehow lost the source code of a long forgotten Office component." is completely false speculation/generalization. It is not the only way.
More than likely it was a project that only builds in a long-discontinued non-conforming C/C++ compiler that only runs on Windows 95. When I was working at MS very recently I was surprised to learn that all software affected by the Sun lawsuit, way back in 2001, was still strictly verboten, even internally. All builds of Windows 95 through XP SP1 were absent from the internal builds and products shares just as they are from MSDN. I don't believe Microsoft would have somehow lost this source code as the article speculates - more likely they reasoned that the effort needed to get a build environment set-up and working would take more time than finding someone on the team who really knows x86 assembly well enough to manually fix the bug - and there are plenty of people like that on campus.
A comment below says that Microsoft didn't write it (Design Science did). Maybe the C++ patch was rejected because the source code did not follow Microsoft's coding rules, which would have mandated a complete rewrite of the code. To avoid a huge development cost, one rule had to be violated: either accept a binary patch or accept bad code. We can imagine many reasonable scenarios to explain what happened. I do not see any shame in this for Microsoft.
In your scenario, Microsoft is making a probabilistic statement ("a bet") that there will not be many future changes to the code.
That is, the binary patch path is more expensive than code patches, because it requires more specialized skills. At some point, with enough patches, it would have been more worthwhile to update the source and rebuild than to modify the binary.
Originally it was a fix for a bug that would show up when running it under Wine. Now the game has been re-released on Steam and GOG, and the same thing came up on regular Win10.
The funniest part is, I don't even remember what the patch does - I made it 12 years ago! I remember spending a lot of time reading disassembly, trying to figure out what's going on, and eventually figuring just enough to plug the immediate hole. Then I posted the binary diff on a forum, and subsequently forgot all about it, because the bug went away in some Wine update shortly thereafter.
When I dug that diff out, it made zero sense to me. So now not only is there no source for the game, but I've "lost the source" for the patch as well - I'd have to reverse engineer it all again to figure out what it actually does.
While we're talking games, kyrub has made a splendid patch called "insecticide" for Master of Magic. The game is famous for its vision, imagination and creativity, but notorious for piles of bugs. The first few patches had several pages of bugfixes. Insecticide added many more pages, plus GUI improvements and AI tweaks.
Some bugs went unnoticed until very recently. For example, I was one of the playtesters of this patch, and I reported that the Invulnerability spell appeared impossible to dispel. Kyrub checked it out, and... would you believe it? There was an off-by-one error, and Invulnerability was the last on the list. It was literally impossible to dispel because of a programming error. Those old strategy guides recommending Guardian Spirit + Invulnerability rushes... they were so effective because they unknowingly exploited a bug!!
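To make the off-by-one concrete, here is a purely hypothetical Python sketch. It is not the game's code (which nobody in this thread has shown); the table contents and function names are made up. It only shows how a loop bound that stops one short makes the last entry in a list unreachable.

```python
# Hypothetical illustration only: the table and names below are invented,
# not taken from Master of Magic.
ENCHANTMENTS = ["first_spell", "second_spell", "Invulnerability"]  # last entry matters

def dispellable_buggy(active):
    """Off-by-one: the loop bound skips the final table entry."""
    found = []
    for i in range(len(ENCHANTMENTS) - 1):  # bug: should be len(ENCHANTMENTS)
        if ENCHANTMENTS[i] in active:
            found.append(ENCHANTMENTS[i])
    return found

def dispellable_fixed(active):
    """Corrected version: every table entry is checked."""
    return [name for name in ENCHANTMENTS if name in active]

active_on_unit = {"second_spell", "Invulnerability"}
buggy = dispellable_buggy(active_on_unit)
fixed = dispellable_fixed(active_on_unit)
```

With the buggy bound, the last spell never shows up as a dispel candidate, no matter how strong the dispel is.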
I used to do this a lot with shareware using a tool called SoftICE. If you ever saw a message box along the lines of "This feature is only available in the paid version...", it was easy enough to pop the debugger, work backwards to see the conditional that causes the message, change to a nop/edit the value, put the instruction pointer back, then continue on with the paid feature. I miss SoftICE.
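The "change to a nop" step can be sketched in Python on a plain byte buffer. This is only an illustration under stated assumptions: the code bytes and the offset are made up, and a real session would patch live process memory from the debugger rather than a Python bytes object. The opcodes themselves are standard x86 (0x74 is a short JE, 0x90 is NOP).

```python
NOP = 0x90        # x86 one-byte no-op
JE_SHORT = 0x74   # x86 "je rel8": a 2-byte instruction (opcode + 8-bit offset)

def nop_out_short_je(code: bytes, offset: int) -> bytes:
    """Return a copy of `code` with the 2-byte short JE at `offset` replaced by NOPs."""
    if code[offset] != JE_SHORT:
        raise ValueError("expected a short JE at the given offset")
    patched = bytearray(code)
    patched[offset:offset + 2] = bytes([NOP, NOP])  # same length, so nothing shifts
    return bytes(patched)

# Made-up code bytes: test eax,eax ; je +5 ; five filler bytes ; ret
original = bytes([0x85, 0xC0, 0x74, 0x05, 0xCC, 0xCC, 0xCC, 0xCC, 0xCC, 0xC3])
patched = nop_out_short_je(original, 2)
```

The patch keeps the length identical, so every other jump target in the binary stays valid. That is the whole appeal of this kind of edit.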
I don't think it was deliberate, to be honest. The bug is really in the game code, but it only manifests when a DirectDraw surface is allocated in just the right way, if I remember the gist of it correctly. Wine likely didn't intentionally fix it - they just changed something about their implementation that accidentally made it go away, and Win10 also changed something that made it repro. I don't actually know if Wine has it again now - it might, I haven't tested that game under it in ages.
I'm having a senior moment here, but wasn't it common to include space in instruction-space and initialized data space, designed to permit binary patching by jump table to (currently) un-used pre-assigned regions?
I thought I'd even seen reference to support in compilers to pre-assign such spaces.
If (for instance) this is a product acquired from outside the Redmond campus, run successfully unpatched since B.C. and now needing work, it might actually be more effective to do what they did instead of reverse-compiling code, re-implementing, the whole shebang.
Security nightmare? Sure. Unsafe? depends. NASA pay old-timers the big bucks to keep things working out in space, which probably involves processes which aren't morally far removed from this.
> I'm having a senior moment here, but wasn't it common to include space in instruction-space and initialized data space, designed to permit binary patching by jump table to (currently) un-used pre-assigned regions?
Are you referring to Microsoft's MOV EDI, EDI at the beginning of every function? That's a hot-patch point [1]. I think that was more for in-memory patching, to get around the problem that if you replace the current code, you might run afoul of some thread actually executing it. That wouldn't necessarily apply to the binary/DLL on disk, which when loaded from scratch wouldn't need any such hot-patching.
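For anyone who hasn't seen the mechanism, here is a rough Python sketch of it on a byte buffer. Assumptions, loudly: buffer offsets stand in for addresses, the function is preceded by five padding bytes, the example image is invented, and `hotpatch` is a hypothetical helper, not a Windows API. The encodings are standard x86: MOV EDI, EDI is 8B FF, E9 is a near JMP with a 32-bit relative offset, and EB F9 is a short JMP back by 7 bytes.

```python
import struct

MOV_EDI_EDI = b"\x8b\xff"  # 2-byte do-nothing instruction at function entry
PAD = 5                    # assumed number of padding bytes before the function

def hotpatch(image: bytes, func_off: int, target_off: int) -> bytes:
    """Redirect the function at func_off to target_off via the hot-patch point."""
    if func_off < PAD:
        raise ValueError("no room for the long jump before the function")
    if image[func_off:func_off + 2] != MOV_EDI_EDI:
        raise ValueError("function does not start with MOV EDI, EDI")
    out = bytearray(image)
    # Step 1: write "jmp rel32" into the padding. rel32 is relative to the end
    # of that 5-byte instruction, which is exactly func_off.
    rel32 = target_off - func_off
    out[func_off - PAD:func_off] = b"\xe9" + struct.pack("<i", rel32)
    # Step 2: overwrite MOV EDI, EDI with "jmp short -7", which lands on the
    # long jump at func_off - 5. Only two bytes change at the entry point.
    out[func_off:func_off + 2] = b"\xeb\xf9"
    return bytes(out)

# Made-up image: 5 padding bytes, old function (mov edi,edi ; ret), then a new one.
image = b"\xcc" * 5 + b"\x8b\xff\xc3" + b"\x90" * 8
patched = hotpatch(image, func_off=5, target_off=8)
```

The point of the two-byte instruction is that the entry-point change is small enough to be written in one go, so a thread is either running the old entry or the new jump, never half of each.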
> I'm having a senior moment here, but wasn't it common to include space in instruction-space and initialized data space, designed to permit binary patching by jump table to (currently) un-used pre-assigned regions
Maybe you're thinking about hot patch points in Windows DLLs?
Contrary to the tone of the article, I wish more of the small and very-likely-security-related patches would come in the form of "actual patches" like this --- and also distributed as actual diffs instead of whole-file-replacements, since this greatly improves the user experience; those of you who have experienced Windows Update taking tens of minutes to download and apply updates may be disappointed and irritated to realise that even if the actual changes are tiny, as in this case, applying the update involves downloading the entire file again. If there are 10 updates which affect a total of 5 KB in a 20MB file, for each one that's installed, you end up redownloading the whole thing, or 200MB of bandwidth used (wasted) to make 5KB of changes. It's not even smart enough to jump to and download just the latest version of all the selected updates.
Another advantage of the small and localised changes, besides rapid and efficient application (and reversion), is the ability to do it to all in-memory copies of the file in addition to the one on disk, eliminating the need to reboot.
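As a rough sketch of what "distributed as actual diffs" could look like, here is a minimal same-length byte diff in Python. It is not how Windows Update or any real patch format works; `make_patch`, `apply_patch` and `patch_size` are hypothetical names, and the sketch assumes the old and new files have the same length, as they would after an in-place binary edit.

```python
def make_patch(old: bytes, new: bytes):
    """Return a list of (offset, replacement_bytes) runs where new differs from old."""
    if len(old) != len(new):
        raise ValueError("this sketch only handles same-length files")
    patch, i, n = [], 0, len(old)
    while i < n:
        if old[i] != new[i]:
            start = i
            while i < n and old[i] != new[i]:
                i += 1
            patch.append((start, new[start:i]))
        else:
            i += 1
    return patch

def apply_patch(old: bytes, patch) -> bytes:
    """Apply the (offset, replacement_bytes) runs to a copy of old."""
    out = bytearray(old)
    for offset, data in patch:
        out[offset:offset + len(data)] = data
    return bytes(out)

def patch_size(patch) -> int:
    """Payload bytes carried by the patch (ignoring offset bookkeeping)."""
    return sum(len(data) for _, data in patch)

# Made-up example: a 1000-byte "file" with three changed bytes.
old = bytes(1000)
tmp = bytearray(old)
tmp[10:12] = b"\x01\x02"
tmp[500] = 0xFF
new = bytes(tmp)
patch = make_patch(old, new)

# The arithmetic from the comment above, assuming 1 MB = 1024 KB:
full_download_kb = 10 * 20 * 1024   # ten updates, whole 20 MB file each time
changed_kb = 5
```

In that example the patch carries 3 bytes against a 1000-byte file. The same ratio problem is what turns 5 KB of changes into 200 MB of downloads.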
> A company like Microsoft that has solid and complex software development and security practices in place would never deem manually binary editing as acceptable.
Perhaps what we need is better tooling to make this process easier, and I guess in general a more "bidirectional" view of binaries vs. source --- small changes in source should correspond to proportionally small changes in binaries, and also proportionally small updates for end-users.
The current "binaries are sacred and shall only be the outputs of long and complex build processes" attitude is neither efficient nor optimal from the perspective of the end-user experience, or even developers sometimes --- for example, changing a one-letter typo in a string constant and testing the change should not involve hours of compilation, deployment, and installation, but be closer to the minutes it would take to do it in a hex editor.
Apart from the downloading time, which varies depending on the connection, the vast majority of time during a Windows update is spent calculating exactly what to update. Even on a 1 Gbps connection with a state-of-the-art SSD, it still takes a long time to update.
More likely because Microsoft knows that a lot of computers don't have that much free space, so only sending you what you need means using less of your disk space. Also, users often have data caps, so keeping the amount downloaded small helps them there as well.
No, it's not about a trade off. The system needs to figure out what to update, and if it is safe to do. There was an article from MS some time ago on the topic. Most of it has to do with shared dlls I believe, but the details are a bit hazy.
I don't think so. Compilers may change the output binary drastically, due to optimisations and other things, despite the source changes being small and localised. I'd argue that such behaviour, while desirable when creating a new binary, is counterproductive when trying to make a "minimal diff" of an existing one to fix something and not make other unnecessary changes at the same time.
Can you imagine the panic they must have felt when they realized that there was security flaw in code that hadn't been compiled in 17 years, that they didn't even write, they couldn't find... and they were responsible for fixing?
Why would they panic over 17-year-old code? Even when I installed Windows 10 a week ago, it came with a long ALL CAPS disclaimer that they are not responsible for any damage to hardware or software caused by the installation.
I bet getting this to compile was more of a headache than manually patching it. This component - being old itself - probably has tons of dependencies on even older libraries. Code in its teens would need a lot of cleanup to be built (and work) with a modern optimizing compiler, and recreating an ancient build environment isn't simple either. They did the cost/benefit analysis before fixing the bug.
Microsoft didn't write it, it is licensed from Design Science (it is just a light weight version of MathType), so it is likely they never had the source code.
Building it was my first thought. Microsoft has copies of their old compilers, but you need a machine that can host it and the compiler. In the end, it could have been easier to just manually patch it. I’m not the guy for the job, but I’ve worked with those that could likely do it, and correctly and quickly enough that it would be a valid option.
If it's a single component, why don't they just rewrite it? A company like Microsoft should have the resources. It might be a lot of work, but can they be sure there are no more bugs? Is keeping a rotting, black-box binary responsible?
> Is keeping a rotting, black-box binary responsible?
Is introducing a whole new set of bugs responsible? After seventeen years, if there are any skeletons left they're going to be hard to find. And I'll betcha there's an individual (at a minimum) at Microsoft sitting there right now beating the hell out of that component with all manner of fuzzing and stack manipulation trying to tease out anything else that might have been missed.
Or we can rewrite it and hope we caught all of those weird edge cases, as well as replicated any bugs that downstream components might have come to rely upon. In the mean time, make sure not to introduce any new bugs or pull in any dependencies that might have new bugs. And make sure to test it on every version of Office for the last seventeen years. We still have those tests for Office 2000, right?
Compatibility. Office doesn't even use that binary anymore, but still ships it to avoid breaking other apps that depend on it. They wouldn't risk a rewrite.
Confirmed. Sometimes the Office team didn't even write the component, but has a dependency on it. When I was on the FoxPro team, Windows decided that it didn't need to ship the Fox ODBC driver anymore because they didn't think anything was using it (use OLE DB and ADO instead; I think they were kind of cleaning house, too), except those things that already shipped with the driver. Someone forgot to tell Office, Office stuff started blowing up, and thankfully it was caught in an internal build. Office engineers probably didn't even know who we were, let alone have the source code, and Microsoft was a hell of a long way from any kind of standardized build system that another group could understand. Thus it ended up on our plate.
So I whipped up a dummy driver that Windows could ship that just NOP'ed every ODBC call with a text box that said "go download the Fox driver from...". For all I know, it still ships in Windows today (VFPODBC.DLL, IIRC).
But even that simple task wasn't simple when you're working at Windows scale. First was just getting it checked in, and check-ins at the time were more tribal knowledge than documented. A helpful Windows engineer documented what they knew. But it turns out that there were a lot of assumptions made that were valid if one were on the Windows team, but not if you're an external group. And that is how the Windows build got broken. Breaking the Windows build is a little different than it might be on your team when the build gets broken. Yeah, maybe you give someone some shit, maybe you make them wear the "I Broke the Build" shirt for a day (don't do that, BTW). But when the Windows build gets broken, emails go out, and war rooms with VPs happen. And that's how I got to meet Chris Jones. He asked why Windows had 1200 testers sitting around doing nothing. Fingers began to point at the strangers in the room. We said that we followed the directions given to the letter, but there was no formal process for Windows to take check-ins from external groups, so here we are. We were dismissed, and a few weeks later a document started going around on how to check in to the Windows repo. It was probably a month or more from the time I wrote the dummy (which probably took all of an hour or two) until it showed up in internal builds.
So, yeah, I can kind of see why there might not have been any volunteers to go modify and rebuild a seventeen year old component. Put a two byte patch on the thing and call it a day.
One way or another, it's Microsoft's failure. Either they don't have a proper environment to build every artifact from its source, or they just lost the sources. That's what happens when you're trying to cut costs.