- fetching embedded content based on a URL (images, movies, CSS, JavaScript, ...)
- Hidden frames and/or WebRTC - can scan your local network
There are probably more.
[Edited: updated to clarify that local network scanning can be done with hidden frames or WebRTC. A follow-up comment from me gives a public example of how.]
My impression is that most tracking happens through "tracking pixels" (1x1 images, usually with 0 alpha so they're fully transparent) combined with cookies.
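To make the pixel idea concrete, here is a rough sketch of what the server side of such a pixel might record. The endpoint URL, the `c` query parameter, and the `handle_pixel_request` function are all made up for illustration; real trackers will differ.

```python
import base64
from urllib.parse import urlsplit, parse_qs

# A 1x1 transparent GIF, base64-encoded.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_pixel_request(url, headers):
    """Return (log_record, response_headers, body) for a hypothetical pixel endpoint."""
    query = parse_qs(urlsplit(url).query)
    record = {
        # "c" is an assumed parameter name, e.g. a campaign or page id.
        "campaign": query.get("c", [None])[0],
        # The browser sends these along with any image fetch.
        "referer": headers.get("Referer"),
        "user_agent": headers.get("User-Agent"),
        # A cookie is what lets repeat requests be linked to one visitor.
        "visitor": headers.get("Cookie"),
    }
    response_headers = {
        "Content-Type": "image/gif",
        "Content-Length": str(len(PIXEL)),
    }
    return record, response_headers, PIXEL
```

The point is that the image itself is irrelevant; the request for it is what carries the information.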
Not sure if this is what grandparent is referring to, but DNS rebinding[0] is what springs to mind.
It's simultaneously kind of smart and also really stupid. Basically, you give a valid 3rd-party domain multiple IPs, one of them a normal public address and one of them a local IP. Then you cut off the public one, the browser falls back to the local address, and because the hostname (and therefore the origin) hasn't changed, it just allows your script to make calls to whatever local interface you want.
There was a good DEF CON video about this a while back[1]. It's a much bigger problem than most people realize. This is why it's good practice to have at least some security around devices even if they're only connected to your LAN.
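As a rough illustration of the "one public IP, one local IP" setup, here is a small sketch that flags a set of resolved addresses mixing the two. The function names are mine, and the addresses are passed in directly instead of being looked up, so this is only a heuristic, not a real defense.

```python
import ipaddress


def classify(addresses):
    """Split address strings into (public, local) lists.

    "Local" here means private, loopback, or link-local ranges.
    """
    public, local = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            local.append(text)
        else:
            public.append(text)
    return public, local


def looks_like_rebinding(addresses):
    """True if one hostname's answers mix public and local addresses."""
    public, local = classify(addresses)
    return bool(public) and bool(local)


# Example: a hostname that answered with both kinds of address.
print(looks_like_rebinding(["8.8.8.8", "192.168.1.10"]))
```

A real attack may instead swap the record between lookups using a short TTL, which a single-snapshot check like this would not see.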
Evercookie[0] was state of the art in 2009. That's centuries ago in Internet time - and it doesn't even include fingerprinting. Cache and cookies are just a small part.
- browsers cache results
- cookies
If you do neither of these things trackers become much harder to implement.