General question about semiconductors: Why is there so much emphasis on the density of transistors rather than purely on the costs of production (compute/$)? CPUs aren't particularly large. My computer's CPU may be just a few tablespoons in volume. Hence, is compute less useful if it's spread out (e.g., due to communication speeds)?
That's only if you needed a signal to cross the whole chip in one cycle. There's no such limitation preventing a 1-foot-wide chip from being filled with 5 GHz cores on an appropriate ring bus.
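To put rough numbers on that, here is a minimal sketch of how far a signal can get in one clock period. It uses the vacuum speed of light as an upper bound; real on-chip wires are RC-limited and much slower. The `velocity_fraction` parameter and the 0.5 value are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope: how far can a signal travel in one clock cycle?
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # vacuum; on-chip signals are slower

def distance_per_cycle_cm(frequency_hz: float, velocity_fraction: float = 1.0) -> float:
    """Distance covered in one clock period, in centimetres."""
    period_s = 1.0 / frequency_hz
    return SPEED_OF_LIGHT_M_PER_S * velocity_fraction * period_s * 100.0

FOOT_CM = 30.48

vacuum_cm = distance_per_cycle_cm(5e9)
half_speed_cm = distance_per_cycle_cm(5e9, velocity_fraction=0.5)  # assumed fraction
cycles_to_cross_foot = FOOT_CM / vacuum_cm  # best case, at the speed of light

print(f"{vacuum_cm:.1f} cm per cycle at c, {half_speed_cm:.1f} cm at 0.5c")
print(f"At least {cycles_to_cross_foot:.1f} cycles to cross a 1-foot chip at 5 GHz")
```

So even in the best case a foot-wide chip takes several cycles to cross at 5 GHz, which is fine as long as nothing needs to cross it in a single cycle.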
You can make X lower by reducing the frequency (= having each cycle be longer)
But apart from that, the main reason big chips would clock slower is power, not timing. If you have a lot of transistors all switching at a high voltage so that the frequency is high, you get molten metal and the magic smoke leaves.
Big chips aren't one big stage where light travels from one side to the other. But they are giant weaves of heating elements that can't all run fast all of the time.
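The power argument can be sketched with the textbook dynamic-power approximation, P ≈ α·C·V²·f. All the numbers below (activity factor, switched capacitance, voltages, frequencies) are made-up illustrative values, not figures for any real chip.

```python
def dynamic_power_watts(activity: float, capacitance_farads: float,
                        voltage_volts: float, frequency_hz: float) -> float:
    """Textbook CMOS dynamic power approximation: alpha * C * V^2 * f."""
    return activity * capacitance_farads * voltage_volts ** 2 * frequency_hz

# Hypothetical chip: same switched capacitance, two operating points.
base_w = dynamic_power_watts(0.1, 1e-7, 1.0, 3e9)  # 1.0 V at 3 GHz
fast_w = dynamic_power_watts(0.1, 1e-7, 1.2, 5e9)  # 1.2 V at 5 GHz
power_ratio = fast_w / base_w

print(f"base: {base_w:.0f} W, fast: {fast_w:.0f} W, ratio: {power_ratio:.2f}x")
```

Under these assumptions, a 1.67x clock increase costs about 2.4x the power, because the higher frequency usually needs a higher voltage and voltage enters squared.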
Cache latency is definitely part of what limits core clock. You're not going to have a good time if your L1 latency is, say, 10 clocks. Not to mention that register files are not much different from SRAM (and therefore cache-like).
You could always purchase a multi-CPU system (effectively what you're suggesting) from several years ago for much less than modern hardware. If you're using it regularly, though, the electricity cost will eventually eat away any savings versus the same computational power in a modern single CPU.
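A rough way to check when the electricity bill catches up with the purchase savings. Every number here (prices, power draw, electricity rate, 24/7 usage) is a hypothetical placeholder; plug in your own.

```python
HOURS_PER_YEAR = 8760  # assumes the machine runs around the clock

def break_even_years(price_gap_usd: float, extra_watts: float,
                     usd_per_kwh: float, hours_per_year: float = HOURS_PER_YEAR) -> float:
    """Years until the extra electricity cost equals the upfront savings."""
    extra_cost_per_year = (extra_watts / 1000.0) * hours_per_year * usd_per_kwh
    return price_gap_usd / extra_cost_per_year

# Hypothetical: old box is $600 cheaper but draws 250 W more at $0.15/kWh.
years = break_even_years(price_gap_usd=600.0, extra_watts=250.0, usd_per_kwh=0.15)
print(f"Savings are gone after about {years:.1f} years")
```

With lighter usage the break-even point moves out proportionally, so the old hardware makes more sense for machines that are mostly idle or powered off.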
With the way solar/wind plus batteries are bringing electricity prices down, the cost per compute will still come down even as Moore's law slows down. Looking at current trends, running today's processors 10 years from now could cost just 10-12% of what it costs now in electricity.
A factory makes transistors, and if you advance a 'node', you make twice as many. If you do an amazing job on cost alone, you might reduce cost by 10%.
So by far the best way to maximize value in semiconductors is to enable shrink.
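The comparison between the two levers can be sketched as cost per transistor. The wafer cost and transistor count are hypothetical round numbers, and the sketch assumes the wafer cost stays flat across the shrink, which real nodes generally do not manage.

```python
def cost_per_transistor(wafer_cost_usd: float, transistors_per_wafer: float) -> float:
    return wafer_cost_usd / transistors_per_wafer

# Hypothetical baseline: $10,000 wafer carrying 1e12 transistors.
baseline = cost_per_transistor(10_000.0, 1e12)

# Lever 1: an excellent cost-reduction program trims wafer cost by 10%.
after_cost_cutting = cost_per_transistor(10_000.0 * 0.90, 1e12)

# Lever 2: a node shrink doubles transistors per wafer (wafer cost assumed flat).
after_shrink = cost_per_transistor(10_000.0, 2e12)

print(f"cost cutting: {after_cost_cutting / baseline:.2f}x of baseline")
print(f"node shrink:  {after_shrink / baseline:.2f}x of baseline")
```

If the newer node's wafers cost more, the shrink's advantage narrows accordingly; the ratio is simply the wafer cost multiplier divided by the density multiplier.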
But you also just don't hear it in the popular or even engineering press. Most manufacturers and designers look at a PPAC curve (power, performance, area, cost) and find optimal design points.
As for spreading it out: the unit of production isn't a wafer, it is a lithographic field, which is roughly 26 x 33 mm. You can't practically 'spread out' much more (OK, you sort of can with field stitching, but that is really expensive).
Because when you make it denser, you can cut the CPU into smaller parts, which decreases costs.
When you make it less dense, it can clock higher, but you will have fewer cores per mm^2.
AMD went with both approaches: their hybrid CPU will have densely packed, lower-clocked Zen 4c cores and some high-speed Zen 4 cores to boost at the highest frequency.
Increasing density has caused chip cost per FLOP/s to decrease exponentially over the last decades. But nowadays the price per transistor doesn't go down with increased density as fast as it used to.
E.g. new Nvidia GPUs are getting smaller for the same price, which means they are getting more expensive for the same size. At some point, the price per transistor will actually increase. Then Moore's Law (the exponential increase in transistor density) will probably stop, simply because it's not economical to produce slower chips for the same price. (Maybe the increased power efficiency will still make density scaling worth it for a little while longer, but probably not a lot longer.)
> this is because Nvidia changed their pricing strategy to decouple it from that
Because neither AMD nor Intel can come within striking distance of Nvidia's flagships, and seeing how their silicon flies off the shelves, they have also adjusted their pricing to match their relative performance to Nvidia.
In addition to the answers already given, there are defects during the process that are more likely to render your chip useless the larger your chip is. This is true for smaller chips as well, and often the design handles a defunct component, but you prefer minimizing defects per chip.
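The effect of die size on defects can be illustrated with the simple Poisson yield model, Y = exp(-A·D0). This is one common textbook model, not necessarily what any fab uses, and the defect density below is an assumed value for illustration only.

```python
import math

def poisson_yield(die_area_cm2: float, defect_density_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-die_area_cm2 * defect_density_per_cm2)

ASSUMED_DEFECT_DENSITY = 0.1  # defects per cm^2, hypothetical

small_die_yield = poisson_yield(1.0, ASSUMED_DEFECT_DENSITY)  # 1 cm^2 die
large_die_yield = poisson_yield(8.0, ASSUMED_DEFECT_DENSITY)  # 8 cm^2 die

print(f"1 cm^2 die: {small_die_yield:.1%} defect-free")
print(f"8 cm^2 die: {large_die_yield:.1%} defect-free")
```

The model ignores the redundancy mentioned above (disabling a defective core or cache slice), which is how large dies with a defect can still be sold as lower-tier parts.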
Density is one of the main ways to get cost savings. But there are others too, and there's also a lot of hype around them. Chiplets for example. Or CXL for memory.
Personal usage still relies on fast single-threaded performance. As for business usage, the cost is primarily energy, which requires a smaller node size for the same performance.
Because you are assuming there is an objectively optimal processor design for a specific manufacturing process.
If you don't constrain the chip to a specific design then what is going to count as compute? The number of adders or multipliers? That is just a different way of talking about transistor density.
Time-of-flight (TOF) latency isn't that big of a deal, though driving a signal over a distance consumes a lot of power, and power has been the primary design limiter for at least a decade.
But I think the GP's point is that heat is far easier managed when spread out over a larger area, so why all the emphasis on ultra tiny transistors vs just making a chip that's two inches by two inches or something?
And I think the main answer to that comes when you look at some of the discourse around Apple's M-series chips, that doing a larger-die design is just way riskier: there are huge implications on cost, yield, flexibility, etc, so it was really something that Apple was uniquely positioned to move aggressively on vs a player like Qualcomm who needs to be way more conservative in what they try to sell to their main customers (phone OEMs like Samsung).
Every chip manufacturer does that; that's how they come up with cheap, low-end parts. They just try to keep the number of cores even, so the trick is less obvious.