Sort of interesting just to hear about the ups and downs of companies like Dubsmash. They were often cited as an example of Berlin's future as a startup city [1]. They went from 35+ employees to 27 [2] to now 12, as they've stated in this post. They also moved from Berlin to New York, which seems to imply they felt the city couldn't offer what they needed. It looks like they didn't take many of the employees with them in the move (maybe this was also a way out of strict German employment rules?). Seems like a bit of an attempt at a restart (co-founder Roland Grenke seems to be gone, etc.).
Looking at their rank history on App Annie, they were doing really well in 2015, but it's been downhill from there (from top 10 to >500 in all the major App Store charts). How they were able to go from 140M to 350M downloads in the last year (compare this article with the TechCrunch one) is a complete mystery. Also, stating your number of users without any qualifier (e.g. MAU) in a tech article is a bit of a red flag; in my experience that usually means it's a vanity number (yearly active? Who knows).
It also sounds odd that they have 3 engineers and 12 employees. What do the other people do?
And hopefully they had more than 3 engineers back when they had 35 employees... but even then, why would they choose to fire engineers and end up with that tech to non-tech ratio?
Dubsmash relies heavily on copyrighted content from big studios (at least it did when it became popular). I guess most of the staff works hand in hand with media companies to promote their content inside the app.
Sounds like that deserves a bit of caution. They were "hiring like crazy" in Berlin not long ago; I guess those jobs would have lasted less than a year:
Interesting, they can't even manage a basic signup process. The app says my email is already in the user list, but when I try to log in, it says no such email exists in their system. The "forgot password" option gives the same result. They should hire someone to fix their auth/ACL code ,_,
Agreed, that part was lengthy as well and seemed to have way too many steps, but given the number of users they've acquired, the process doesn't seem to be holding them back.
Thought the same. My wild guess is that those engineers are at their career peak in terms of energy and ability to deliver glue code, but a few years away from being well-rounded engineers who can live and work sustainably.
Three engineers maintain code in Java, Swift, previously Objective-C, Go, Python (both Django and Flask), Node.js, considering Kotlin, and additionally make use of Celery, RabbitMQ, React, Redux, Apollo, GraphQL, Postgres, Heroku, AWS, Jenkins, Kubernetes, Redis, DynamoDB, Elasticsearch, Algolia, Memcached, and more.
I might be an inexperienced engineer by comparison, but I'll be honest, that sounds absolutely fucking insane. These three people must be geniuses to be able to use all of that with sufficient mastery to effectively handle 200M users.
Sometimes I wonder if there are any internet companies (startup or otherwise) that do customer support. With numbers like that, it's hard to imagine one of those users getting even one second of attention with any problems they might have.
You can only really do customer support if it makes financial sense, which it won't unless you make a significant amount of money on your average customer. Tech companies that don't have sales, but instead take their revenue through ads or through selling data are making cents per customer. With average profit that low, even 1/1000 customers making use of your support for 5 minutes would destroy any chance of profit.
> We since have moved to a multi-way handshake-like upload process that uses signed URLs vendored to the clients upon request so they can upload the files directly to S3.
How does this work in practice / where can one learn more about this?
I want to make sure that I understand the security aspect of this.
You can argue that the user can upload anything using the original api anyway. But in the original case you can do server-side validation before the upload is proxied. I am thinking stuff that are domain specific like only allowing videos that are 6 seconds long or something.
You can move the validation to the client, but the client can be easily modified. An actual user might not do this, but someone trying to steal your storage space (for serving malware or something) might?
These signed URLs also seem to expire based on time, so you could potentially save the URL and upload again later if you allow a generous expiration. (Again, not really something I see being a huge problem.)
But I guess these aren't really serious issues compared to the cost savings. Am I missing other ways this can be exploited?
You would use two buckets in this case. Input bucket gets consumed by worker processes to do the transcoding (and validation) and then they upload into the output bucket. The output bucket is what you serve to clients (hopefully with a CDN in front).
This is more complicated than I imagined so I am not sure the cost saving will still work out (factoring in development time and extra code maintenance cost).
No, I don’t, sorry. What I can promise you is that you’ll thank yourself for implementing it! There is hardly any additional complexity here because you’d probably be uploading the derived content somewhere anyway. Now you’re just putting it in a different place than the source.
You can use whatever queue you’re comfortable with so long as you can pipe the upload events from the bucket into it. The pattern I’m outlining is just a physical separation of buckets to make access control much harder to screw up.
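A stdlib-only sketch of that separation, with the bucket I/O, validation, and transcoding stubbed out as function parameters (all names here are made up for illustration, not Dubsmash's actual code):

```python
import queue

def drain_upload_events(events: queue.Queue, read_input, validate,
                        transcode, write_output) -> int:
    """Consume pending upload notifications for the *input* bucket,
    validate and transcode each object, and publish the result to the
    separate *output* bucket that the CDN serves. Returns the number
    of objects published."""
    published = 0
    while True:
        try:
            event = events.get_nowait()   # e.g. an S3 ObjectCreated message
        except queue.Empty:
            return published
        raw = read_input(event["key"])    # fetch from the input bucket
        if not validate(raw):             # e.g. reject clips over 6 seconds
            continue                      # invalid uploads never reach the CDN
        write_output(event["key"], transcode(raw))
        published += 1
```

The point of the two buckets is exactly this: clients only ever get write access to the input bucket and read access to the output bucket, so a misconfigured ACL can't turn raw user uploads into publicly served files.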
I can comment on using pub/sub - it's an immensely useful abstraction for these kinds of tasks and something that is quite difficult to implement yourself with the same level of guarantees that using the cloud service will provide. Any time you need to pass information or trigger events asynchronously messaging is the first choice IMO.
Not 100% sure what they mean by _vendored_ here, but I'm guessing they make a request to one of their backends to generate the URL and return it to the client for use.
One thing to keep in mind, users should be able to upload (to the specific signed URL), they should not be able to download from that location. Don't make the files users can upload publicly downloadable, otherwise you can be used to host malware. After the video/image is uploaded, you need to download and process it[1], then upload it to an S3 bucket that allows download (e.g, via CDN).
[1] Use caution when processing user content. It is best to process media in a sandbox that can protect you against exploits in the media processing libraries.
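One cheap way to get some of that isolation on Linux is to cap the processing subprocess's resources (a sketch only; a real sandbox would add seccomp, namespaces, or containers, and the command you run is whatever media tool you use):

```python
import resource
import subprocess

def process_media(cmd, timeout=60, mem_bytes=512 * 1024 * 1024):
    """Run a media-processing command with CPU and memory ceilings so an
    exploit or decompression bomb in a codec can't take down the host."""
    def limits():
        # Applied in the child process just before exec.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (timeout, timeout))
    return subprocess.run(cmd, preexec_fn=limits, timeout=timeout,
                          capture_output=True)
```

Anything that times out, OOMs, or exits non-zero gets treated as a rejected upload rather than crashing the worker.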
Client makes a request to the server, passing its auth token; the server verifies the token and uses the S3 library to generate a unique, short-lived URL for upload, which it returns to the client. The client then makes a PUT request directly to the S3 URL. There's no revocation step: the URL simply stops working once its expiration time passes.
Multipart signed upload is much harder and requires signing every chunk.
Just google s3 signed upload there are a few tutorials from Amazon.
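For anyone curious how the signing itself works, here's a simplified stdlib-only sketch of the idea. This is illustrative HMAC query-string signing, not AWS's actual SigV4 algorithm, and the secret, host, and parameter names are all made up:

```python
import hashlib
import hmac
import time
from urllib.parse import quote, urlencode

SECRET = b"server-side-signing-key"  # placeholder; never shipped to clients

def make_upload_url(bucket: str, key: str, expires_in: int = 300) -> str:
    """Server side: mint a short-lived URL the client may PUT to."""
    expires = int(time.time()) + expires_in
    payload = f"PUT\n/{bucket}/{key}\n{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    query = urlencode({"Expires": expires, "Signature": sig})
    return f"https://{bucket}.storage.example.com/{quote(key)}?{query}"

def verify_upload(bucket: str, key: str, expires: str, sig: str) -> bool:
    """Storage side: accept the PUT only if the signature is valid and fresh."""
    if int(expires) < time.time():
        return False  # URL has expired; client must request a new one
    payload = f"PUT\n/{bucket}/{key}\n{expires}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Real S3 presigned URLs work the same way conceptually: boto3's `generate_presigned_url` computes the signature locally using the server's credentials, S3 re-derives it on receipt, and nothing verifies after the expiry timestamp. That's why no separate revocation call is needed.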
> However, we discovered after some time that the custom Python implementation for those workers was dropping up to 5% of the events. This was mostly due to the nature of how reading happens with Kinesis: every stream has multiple shards (ours up to 50!) and each reading client would use a so-called shard iterator to keep track of where it was reading last. Since the used machines could always crash, be recycled, or scaled down, we needed to save those shard iterators in some serialized format to Redis and share them across machines and process boundaries. Since we had so many shards, every once in awhile we would skip events and hence lose them.
I've never worked with Kinesis, but in Kafka you'd store offsets specifically to solve this issue. When one of the members of a consumer group drops out, the partition (read: shard) is automatically reassigned to another member. This gives an at-least-once delivery guarantee, which combined with idempotent actions gives effectively-once semantics. No need to lose any messages. What was the issue that the Dubsmash engineers were solving here?
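A toy stdlib model of that guarantee (the log structure, handler, and id field are made up for illustration; a real consumer would use the Kafka client's commit API instead):

```python
def consume(log, committed_offset, handler, seen):
    """At-least-once consumption: advance the committed offset only *after*
    a message is handled, so a crash replays messages rather than dropping
    them. Deduplicating on message id turns replays into effectively-once
    processing."""
    for offset in range(committed_offset, len(log)):
        message = log[offset]
        if message["id"] not in seen:    # idempotency: skip replayed messages
            handler(message)
            seen.add(message["id"])
        committed_offset = offset + 1    # "commit" only after success
    return committed_offset
```

Contrast this with the shard-iterator scheme described in the post: serializing iterators to Redis across crashing machines makes it easy to resume *past* unprocessed events, whereas committing after handling can at worst replay a few.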
Home-rolling a checkpoint-free event pipeline is a rookie mistake; it's a pity they didn't come across our Snowplow project (Apache 2.0 event pipeline running on Kinesis, Kafka and NSQ, https://github.com/snowplow/snowplow/).
> Although we were using Elasticsearch in the beginning to power our in-app search, we moved this part of our processing over to Algolia a couple of months ago;
Running an alternative solution with similar availability, performance and relevance will in most cases be substantially more expensive though. It really depends on your use case.
I am genuinely curious about the trade-offs, as the bad and the ugly are not mentioned. Realistically, there are a lot of moving pieces there, and yet the team of 3 keeps taking on experimental technology?
[1] http://www.wired.co.uk/article/european-startups-2016-berlin [2] https://techcrunch.com/2016/11/30/dubsmash-9m/