Friday, May 3
May. 3rd, 2024 10:19 pm
No surprises on the train this time. I sat on the chair and read another paper about Argentina’s welfare offices. From the same author. He just goes there and describes everything in detail. And then analyzes.
Then in the office, in the morning, I made some changes to the screens API: don’t include trains arriving more than 45 minutes from now. At night the headways are 20 minutes, so this will show the next two trains. No need to scroll the screen with the trains for the next two hours. That is what Sunny told me to do. So many nuances – so cool.
Another thing I added was to limit the number of trains returned. I filter the trains; I sort them by estimated arrival time; I take the top K. Is it O(N) or O(N log N)? I suspect the latter. But perhaps it is smart enough to consider both sortedBy and take and not sort everything filtered? I don't know yet.
    trains
        .asSequence()
        .filter {
            // apply filter
            currentStation in it.stops
        }
        .sortedBy { it.estimatedTime } // sequences have sortedBy; sortBy is the in-place MutableList version
        .take(2)
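A note on the complexity question: as far as I know, sorting a Kotlin sequence is a stateful step that buffers and sorts everything that passed the filter before `take` sees the first element, so the chain above is O(N log N) in the number of filtered trains. If that ever matters, a bounded max-heap keeps only the K earliest trains and runs in O(N log K). Below is a minimal sketch of that idea; `Train`, its fields, and `nextTrains` are hypothetical names made up for illustration, not the real screens API types.

```kotlin
import java.util.PriorityQueue

// Hypothetical model: the real train type in the screens API is not shown in this post.
data class Train(val id: String, val stops: Set<String>, val estimatedTime: Long)

// Returns the k earliest trains that serve `station`, ordered by estimated time.
// Uses a max-heap capped at k elements, so the cost is O(N log K) instead of O(N log N).
fun nextTrains(trains: List<Train>, station: String, k: Int): List<Train> {
    if (k <= 0) return emptyList()
    // Max-heap on estimatedTime: the head is the latest train among those kept so far.
    val heap = PriorityQueue<Train>(k, compareByDescending<Train> { it.estimatedTime })
    for (train in trains) {
        if (station !in train.stops) continue
        if (heap.size < k) {
            heap.add(train)
        } else if (train.estimatedTime < heap.peek().estimatedTime) {
            // This train arrives earlier than the latest kept one, so it replaces it.
            heap.poll()
            heap.add(train)
        }
    }
    // At most k elements remain, so this final sort is cheap.
    return heap.sortedBy { it.estimatedTime }
}
```

With K = 2 and a few dozen trains per station the difference is negligible, so the plain filter, sort, take chain is probably the more readable choice here.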
Then I got a new task. I feel like I'm in the Disneyland of cloud infrastructure: every day I get a new task and learn a new part of an interesting system. This time I looked into Google Cloud. The task: the costs rose N times in the past months – what the hell is going on with that? It feels like a game, a puzzle. I quickly discovered this was a fraud scheme called SMS pumping. Bots generate login requests that trigger an SMS. The SMS go to countries where the scammers take a cut from the network carriers, which make money from receiving the SMS.
I went to Google Cloud metrics. I figured out that most of the spending was on SMS to Togo and Sri Lanka. The proper way to measure this would be to look at the ratio of login requests (the ones that send an SMS) to successful logins. Then I fought Google Metrics to compute these numbers over the last couple of weeks. What the fuck is this system? It has a whole new language for querying the metrics, and nothing really useful. It shows the sent SMS count; it shows successful signups. Divide one by the other over a window of two weeks – nope, it shows me per-second rates instead. As their tutorial popups kept showing up, I realized this metrics analytics system was complete crap.
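The ratio itself is simple once the raw counts are out of the metrics UI. Here is a minimal sketch, assuming the events for the two-week window have already been exported into a list; `Event`, `Kind`, `CountryStats`, and `statsByCountry` are hypothetical names, and the real export surely has a different shape.

```kotlin
// Hypothetical event model, assumed to be pre-filtered to the time window of interest.
enum class Kind { SMS_SENT, LOGIN_OK }

data class Event(val country: String, val kind: Kind)

data class CountryStats(val smsSent: Int, val logins: Int) {
    // Share of sent SMS that led to a successful login; null when no SMS was sent.
    val conversion: Double?
        get() = if (smsSent == 0) null else logins.toDouble() / smsSent
}

// Counts sent SMS and successful logins per country.
fun statsByCountry(events: List<Event>): Map<String, CountryStats> =
    events.groupBy { it.country }.mapValues { (_, perCountry) ->
        CountryStats(
            smsSent = perCountry.count { it.kind == Kind.SMS_SENT },
            logins = perCountry.count { it.kind == Kind.LOGIN_OK },
        )
    }
```

A country with a lot of sent SMS and a conversion near zero is the pattern SMS pumping would be expected to produce, while ordinary traffic should convert at a much higher rate.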
I told Will. He said: "i defer to you on recommendations for moving forward". I thought – alright; what would I really do if it was my system? I think I should just go ban Togo and Sri Lanka phone numbers for a start. This is a local New York service, and those are the top spenders among foreign countries; it's obvious they are the main source of the fraud. Then, we can re-evaluate the long-term strategy. I told Will. He took a pause, then said that he was "not comfortable blocking African and Middle Eastern countries" just because there were too many login requests coming from their area code. I realized I forgot to tell him the main part – that it was a common fraud scheme, with a link. That was kind of funny. But yeah; maybe I was too quick. It's not my own service, after all; we make a public service in a large metropolitan area. I should think twice before giving such recommendations.
Then Will told me about the accounts dataset in Bigtable, and I wrote a script in Kotlin Notebooks joining that with the billing costs, and then tried to think of some more arguments about when the benefit of supporting a country's phone code drops below the cost of the fraud. And I thought that was the topic to think about on Monday, or maybe to puzzle over during the weekend.
That was pretty engaging; I realized it was almost 7. B texted. I went back home. We cooked soba together, with salmon, cherry tomatoes, cucumber, and sauerkraut.