1. Millions of lines of messy Java code, heavy on annotations and dependency injection, form a multi-shard distributed job with tens of distributed downstream backends. Some of those backends expose fake, always-successful synchronous RPCs that hide the fact that the underlying operations are actually asynchronous and fail very often. What makes this code even worse to work on is that it was built on a single transactional database, and because of reliability issues with that database, the data is now spread across at least three different transactional databases. Code written against one database's transactions now touches several, which causes endless race conditions and concurrency issues everywhere. Production releases of this distributed job used to happen once a week; now a gap of several months is normal, and a quarterly rollback surprises no one.
2. Almost a million lines of messy C++ code, with several .cc/.cpp files containing tens of thousands of lines each and some class implementations spread across multiple .cc/.cpp files. I have always been scared to touch some of these giant files. People who have worked on the code for years still easily make basic mistakes when adding or modifying a small feature (with so-called full unit-test coverage, of course). There are multiple millions of lines of test code, almost 10x the code being tested, but most of it is bogus and tests almost nothing, so silly mistakes show up everywhere, every day. Even the original author of the code base needs 5 follow-up fixes to make a 10-line behavior change work.
For both code bases and the jobs that run them, people are now talking about breaking them into microservices by converting function calls in the existing code into RPCs. I can foresee a tremendous number of service outages coming...
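To make the "fake always-success synchronous RPC" from the first item concrete, here is a minimal Java sketch of the pattern as I understand it. It is not the real code: `FlakyBackend`, `FakeSyncClient`, `HonestClient` and `RpcDemo` are hypothetical names I made up, and a negative amount simply stands in for whatever makes the real asynchronous operation fail.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical backend: the real work is asynchronous and may fail.
class FlakyBackend {
    // A negative amount stands in for any real-world failure (assumption for this sketch).
    CompletableFuture<Void> applyAsync(int amount) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        if (amount < 0) {
            result.completeExceptionally(new IllegalStateException("backend rejected " + amount));
        } else {
            result.complete(null);
        }
        return result;
    }
}

// The anti-pattern: start the async work, drop the future, report success.
class FakeSyncClient {
    private final FlakyBackend backend;

    FakeSyncClient(FlakyBackend backend) {
        this.backend = backend;
    }

    boolean apply(int amount) {
        backend.applyAsync(amount); // outcome is never observed
        return true;                // caller is always told "success"
    }
}

// A more honest wrapper: wait for the real outcome and surface failure.
class HonestClient {
    private final FlakyBackend backend;

    HonestClient(FlakyBackend backend) {
        this.backend = backend;
    }

    boolean apply(int amount) {
        try {
            backend.applyAsync(amount).join(); // throws CompletionException on failure
            return true;
        } catch (CompletionException e) {
            return false;
        }
    }
}

class RpcDemo {
    public static void main(String[] args) {
        FlakyBackend backend = new FlakyBackend();
        System.out.println(new FakeSyncClient(backend).apply(-1)); // prints "true" although the operation failed
        System.out.println(new HonestClient(backend).apply(-1));   // prints "false"
    }
}
```

Every caller of something like `FakeSyncClient` has to assume the write landed, so failures only surface later as inconsistent data. Turning plain function calls into RPCs adds more places where this kind of hidden failure can creep in.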