Figured I should make a Devski post. It's been quite a while since the last one.

I haven't really been less transparent, I just haven't accomplished much.

My life got really busy about six months ago, between a new job, house church, and tai chi classes.

About a month ago, I finished up the code to convert and load user data from SQL. If you remember some instability early one week last month, that was why. Despite being far less data, the user data has about twice as many database columns and tables as publications, and, proportionally, we saw about twice as many issues pop up. Unfortunately, I knew I wouldn't have time to fix them, so I ended up reverting the changes.

The biggest issue we saw was the hubwheel dots not matching the number of people who actually shared the post. This was caused by vote data being duplicated in like four places, and the SQL conversion not saving to all of the duplicate places correctly and atomically.
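
To illustrate what "atomically" means here, a minimal sketch in Python with SQLite. This is not the Hubski code (which is Arc), and the table and column names are hypothetical. It assumes a share has to be written in two places, and wraps both writes in one transaction so that either both land or neither does.

```python
import sqlite3

def record_share(conn, username, pub_id):
    """Write one share to both places inside a single transaction."""
    # Using the connection as a context manager commits if the block
    # succeeds and rolls back if any statement raises, so the stored
    # score and the shares table cannot drift apart.
    with conn:
        conn.execute(
            "UPDATE publications SET score = score + 1 WHERE id = ?",
            (pub_id,),
        )
        conn.execute(
            "INSERT INTO shares (username, pub_id) VALUES (?, ?)",
            (username, pub_id),
        )

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE publications (id INTEGER PRIMARY KEY, "
    "score INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE shares (username TEXT NOT NULL, pub_id INTEGER NOT NULL, "
    "UNIQUE (username, pub_id))"
)
conn.execute("INSERT INTO publications (id) VALUES (1)")
conn.commit()

record_share(conn, "alice", 1)
try:
    # A second share by the same user violates the UNIQUE constraint,
    # so the score increment in the same transaction is rolled back too.
    record_share(conn, "alice", 1)
except sqlite3.IntegrityError:
    pass
```

Without the transaction, the failed second call would leave the score one higher than the number of rows in the shares table, which is the same kind of mismatch as the dots problem described above.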

If you're beyond, like, a sophomore computer science student, you know duplicate data is capital-E Evil. This kind of thing is why it takes a considerable effort for me not to criticise the Hacker News source (from which Hubski was forked), its language, and the author of both.

So, I reverted the users-in-SQL work I spent several months of free time on, and started working on removing the duplicate vote data. Fortunately, one of those duplicate places was publications, which we converted to SQL last summer.

I've been at OSCON all week, so I spent most of the last five evenings doing that. It's pretty much de-duplicated now. The only duplication left is 'vote' data, versus 'shared by' data. But all the 'votes' are in one place, as are the 'shares', and just this evening I changed publication 'score' to pull from the 'shared' data. Votes and shares aren't quite the same, so it's going to be a lot more work to combine them, and I don't think it's as big an issue.
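
As a rough sketch of the "score pulls from the shared data" idea, again in Python with SQLite rather than the actual Arc code, and with a hypothetical schema: instead of storing a score that has to be kept in sync, the score is computed from the share rows whenever it is needed.

```python
import sqlite3

def publication_score(conn, pub_id):
    """Derive the score from share rows instead of storing a copy of it."""
    row = conn.execute(
        "SELECT COUNT(*) FROM shares WHERE pub_id = ?", (pub_id,)
    ).fetchone()
    return row[0]

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shares (username TEXT NOT NULL, pub_id INTEGER NOT NULL, "
    "UNIQUE (username, pub_id))"
)
conn.executemany(
    "INSERT INTO shares (username, pub_id) VALUES (?, ?)",
    [("alice", 1), ("bob", 1), ("alice", 2)],
)
conn.commit()

score_for_first = publication_score(conn, 1)
score_for_missing = publication_score(conn, 99)
```

The trade-off is a query per lookup in exchange for there being only one place the number can come from.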

If you noticed some slowness in feeds earlier this evening (Thursday), that was me. I made the score load immediately from SQL when needed, and pushed it live. Turns out, the score was being loaded unnecessarily often (like so much other data). I saw the slowness, figured out what functions were being called too much, and did some higher-order-function magic to fix it.
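
The post doesn't say what the higher-order-function fix actually was. One common shape for this kind of fix is memoization: a function that takes the expensive loader and returns a wrapped version that only calls it once per key. A minimal Python sketch under that assumption, with hypothetical names throughout:

```python
import functools

def memoize(loader):
    """Return a version of loader that calls it at most once per key."""
    cache = {}

    @functools.wraps(loader)
    def wrapper(key):
        if key not in cache:
            cache[key] = loader(key)
        return cache[key]

    # Expose a way to drop cached values, e.g. after a new share comes in.
    wrapper.clear = cache.clear
    return wrapper

calls = []

def load_score_from_sql(pub_id):
    """Stand-in for the real query; records each call so we can count them."""
    calls.append(pub_id)
    return 10 * pub_id

load_score = memoize(load_score_from_sql)

# A feed render might ask for the same score several times.
first = load_score(1)
second = load_score(1)
other = load_score(2)
```

Here the underlying loader runs once for publication 1 and once for publication 2, no matter how many times the feed code asks.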

So yeah, lots of vote/share de-duplication. Should be mostly done now.

Next on my list is moving password hashes into SQL, and API app code to log in and create tokens. API logins will then let us make private user data API endpoints, e.g. for a user's personal feed. So, with logins, we'll be able to add API endpoints for all data, one at a time.
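
For a sense of what "log in and create tokens" involves, a minimal Python sketch using only the standard library. It is an illustration under assumptions, not the planned Hubski implementation: the hashing scheme (PBKDF2), the iteration count, and every function and variable name here are hypothetical, and the dictionaries stand in for SQL tables.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor, not a value from the post

def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password, salt, expected_digest):
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)

def login(username, password, password_table, token_table):
    """Return a fresh API token on success, or None on failure."""
    record = password_table.get(username)
    if record is None:
        return None
    salt, digest = record
    if not verify_password(password, salt, digest):
        return None
    token = secrets.token_urlsafe(32)
    token_table[token] = username  # later requests look the user up by token
    return token

# Stand-ins for the SQL tables.
password_table = {"alice": hash_password("correct horse")}
token_table = {}

good_token = login("alice", "correct horse", password_table, token_table)
bad_token = login("alice", "wrong password", password_table, token_table)
```

A private endpoint, such as a personal feed, would then check the token sent with each request against the token table to find out which user is asking.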

Somewhere in the middle of that, I might try to apply the SQL user data migration again, if I'm at a point where I know I'll have time to fix the issues that pop up.

As always, questions welcome.

ButterflyEffect:

Don't say this nearly enough, I appreciate your efforts rob. Thanks for putting in so much effort to keep this place going and improving.

posted 2896 days ago