I want to make sure that I understand the security aspect of this.
You can argue that the user can upload anything using the original API anyway. But in the original case you can do server-side validation before the upload is proxied. I am thinking of things that are domain-specific, like only allowing videos that are 6 seconds long or something.
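To make the trade-off concrete, here's a minimal sketch (all names and limits are made up for illustration) of the kind of validation a server can still do when minting an upload URL: it can gate on client-declared metadata like size and content type, but a domain rule like "video must be 6 seconds" can only be checked after the bytes actually exist.

```python
# Hypothetical mint endpoint: checks we can still do before issuing a
# signed upload URL. The client can lie about these, so the same limits
# should also be baked into the signed request itself where the storage
# service supports it, and re-checked after upload.
MAX_BYTES = 50 * 1024 * 1024          # illustrative cap
ALLOWED_TYPES = {"video/mp4", "video/webm"}

def mint_upload_url(declared_type: str, declared_bytes: int) -> dict:
    if declared_type not in ALLOWED_TYPES:
        raise ValueError("unsupported content type")
    if declared_bytes > MAX_BYTES:
        raise ValueError("file too large")
    # A real service would call the storage SDK here to sign the URL;
    # we just return the constraints that would be signed into it.
    return {"content_type": declared_type, "max_bytes": MAX_BYTES}
```

Anything that requires looking at the actual bytes (duration, codec, whether it's really a video at all) has to move to whatever process consumes the upload afterwards.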
You can move the validation to the client, but the client can be easily modified. An actual user might not do this, but someone trying to steal your storage space (for serving malware or something) might?
These signed URLs also seem to expire based on time, so you could potentially save the URL and upload again later if you allow a generous expiration. (Again, not really something I see being a huge problem.)
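The expiry behavior is inherent to how these URLs work: the deadline is part of the signed payload, so the client can't extend it, but a saved URL stays valid right up to the deadline. A toy HMAC version (not the real GCS signing scheme, just the shape of it):

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"server-side-signing-key"  # stand-in for the real service key

def sign_upload_url(path: str, expires_in: int, now: float = None) -> str:
    """Issue a time-limited URL: the expiry timestamp is covered by the
    signature, so tampering with it invalidates the URL."""
    expires = int((now if now is not None else time.time()) + expires_in)
    payload = f"{path}?expires={expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"

def verify(url: str, now: float = None) -> bool:
    """Accept the URL only if the signature matches and it hasn't expired."""
    path, query = url.split("?", 1)
    params = dict(p.split("=") for p in query.split("&"))
    payload = f"{path}?expires={params['expires']}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    current = now if now is not None else time.time()
    return hmac.compare_digest(expected, params["sig"]) and current < int(params["expires"])
```

So the knob you control is the expiration window: a short one limits how long a captured URL is replayable.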
But I guess these aren't really serious issues compared to the cost savings. Am I missing other ways this can be exploited?
You would use two buckets in this case. Input bucket gets consumed by worker processes to do the transcoding (and validation) and then they upload into the output bucket. The output bucket is what you serve to clients (hopefully with a CDN in front).
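A sketch of that two-bucket flow, with plain dicts standing in for buckets and trivial stand-ins for the validation and transcoding steps (a real version would use the storage SDK and an actual transcoder):

```python
input_bucket: dict = {}   # untrusted: signed-URL uploads land here
output_bucket: dict = {}  # validated/transcoded: this is what clients read

def validate(data: bytes) -> bool:
    # Stand-in for the real checks (duration, codec, size, ...).
    return data.startswith(b"VIDEO")

def transcode(data: bytes) -> bytes:
    # Stand-in for the actual transcoding step.
    return data.lower()

def process(object_name: str) -> bool:
    """Worker: consume from the input bucket, publish to the output bucket.
    Invalid uploads never reach the bucket that clients can read."""
    data = input_bucket.pop(object_name)
    if not validate(data):
        return False
    output_bucket[object_name] = transcode(data)
    return True
```

The access-control payoff is that the signed upload URLs only ever grant writes to the input bucket, and the CDN only ever reads the output bucket.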
This is more complicated than I imagined so I am not sure the cost saving will still work out (factoring in development time and extra code maintenance cost).
No, I don’t, sorry. What I can promise you is that you’ll thank yourself for implementing it! There is hardly any additional complexity here because you’d probably be uploading the derived content somewhere anyway. Now you’re just putting it in a different place than the source.
You can use whatever queue you’re comfortable with so long as you can pipe the upload events from the bucket into it. The pattern I’m outlining is just a physical separation of buckets to make access control much harder to screw up.
I can comment on using pub/sub - it's an immensely useful abstraction for these kinds of tasks, and something that is quite difficult to implement yourself with the same level of guarantees that the cloud service provides. Any time you need to pass information or trigger events asynchronously, messaging is the first choice IMO.
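The wiring is the same shape regardless of which queue you pick: bucket upload notifications go in one end, workers pull them out the other. A minimal stdlib sketch of that shape (a real deployment would subscribe workers to the cloud's upload notifications instead of an in-process queue):

```python
import queue
import threading

events = queue.Queue()   # stand-in for the pub/sub topic
processed = []           # records what the worker handled, for illustration

def on_upload(object_name: str) -> None:
    """Stand-in for the bucket's upload notification hook."""
    events.put(object_name)

def worker() -> None:
    """Consumer loop: pull upload events and process them one by one."""
    while True:
        name = events.get()
        if name is None:          # shutdown sentinel
            break
        processed.append(name)    # transcode/validate would happen here
        events.task_done()
```

The decoupling is the point: the uploader never talks to the worker directly, so you can add workers, retry failures, or swap the queue implementation without touching the upload path.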
I am looking into the GCS version, not S3, if that matters: https://cloud.google.com/storage/docs/access-control/signed-...