First off, we would like to officially welcome rob05c to team Hubski!
rob05c brings programming firepower to the team at a much needed time. Like all of our team members, rob05c was first a hubskier, and it is clear that he gets what we are doing. Speaking for myself, I knew that rob05c was a great pick when he said that he could only join us if we open-sourced the code.
So yes, we are open-sourcing the code, and that will happen once we have made sure that there are no vulnerabilities exposed in the release.
rob05c is already at work, and he is also leaning on us to clean up our workflow. :)
All in all, rob05c has already been a great addition to the team. Feel free to send him all bug reports. :D
Welcome, rob05c!
Regarding updates, a couple of days ago flagamuffin inquired about some information that used to be available on a tag page. Previously, you could see the users that most commonly used that tag.
We have brought that back, plus a little bit more.
Clicking on 'more' on a tag page will reveal some additional information about the tag.
It's likely that this functionality could be incorporated elsewhere on the site in time.
As always, feedback is much appreciated.
Welcome, rob05c!
Some words of caution: Reddit went open-source in 2008... except for their spam measures. This led to Webtoid and a couple others, all of which were completely eaten by spam within weeks. Rather than go full open all at once, it might be useful to open modules or portions and then expand as you gain confidence in stomping out exploits. It might also be handy to add a "@dev@" user that y'all can follow. Is that the case with hubski? As it is, I usually bug mk and thenewgreen and, if I remember, forwardslash and insomniasexx, but now that there's five of you it'd be handy to ping the lot of you with site-related questions that we see in general discourse. Welcome, Rob. You're a good man.
"it might be useful to open modules or portions and then expand as you gain confidence in stomping out exploits."
Thanks. That's a good idea, but I'm not sure hubski is really modular enough (yet). We could publish sets of functions, but I doubt anyone would do more than glance at it, until we release enough to run the app. Which is basically everything. Hubski is only like 15k lines of code.
"It might also be handy to add a 'dev' user that y'all can follow"
A tag might be better suited than a user. I think we all follow #bugski. Maybe we should complement that with #devski or something. But yeah, for our side of posting things, I think hubski serves that purpose.
I deal with spam over hundreds of servers in the US, Iceland, Amsterdam, and Singapore as part of my job as a sysadmin. If you guys run into trouble, I'd be happy to have a conversation about the things we have found work best for our users. (Barracuda systems is also located in A2; I'm sure there are people there we can get on the hubski train if needed.)
So, how would you be dealing with spam?
"A tag might be better suited than a user. I think we all follow #bugski. Maybe we should complement that with #devski or something."
Sounds like a good idea.
Reddit. Gimme a sec, I'll find the comment virus.
"To prevent double escaping of certain characters, they are run through MD5 after being escaped once, and then the MD5 is undone at the end. Since the MD5 is the same every time, someone figured out that if you just put the MD5 into your comment, it would be unescaped at the end."
I can only assume "MD5" is something to do with Markdown and not the hash function. Otherwise, what the fuck am I reading?
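Taking the quoted description at face value (MD5 meaning the hash function), here is a rough Python sketch of why a fixed placeholder is exploitable. This is not Reddit's actual code: `token`, `SPECIAL`, and `render_unsafe` are hypothetical names, the set of protected characters is an assumption, and the Markdown step in the middle is left out entirely.

```python
import hashlib
import html

# Assumed set of characters the renderer protects with placeholders.
SPECIAL = ["<", ">", '"']


def token(ch: str) -> str:
    """Fixed placeholder for a character: the hex MD5 of that character."""
    return hashlib.md5(ch.encode("utf-8")).hexdigest()


def render_unsafe(comment: str) -> str:
    """Escape once, then turn every placeholder back into its raw character."""
    # Step 1: escape user input, so a literal "<" becomes "&lt;".
    text = html.escape(comment)
    # Step 2 (omitted): Markdown processing would emit its own markup as
    # placeholders here so a later escaping pass cannot touch it.
    # Step 3: undo the placeholders. Because each placeholder is the same
    # every time, a user can type it directly and it is restored unescaped.
    for ch in SPECIAL:
        text = text.replace(token(ch), ch)
    return text


# An honest "<script>" is escaped as expected.
print(render_unsafe("<script>"))

# A payload built from the predictable placeholders skips the escaping.
payload = token("<") + "script" + token(">")
print(render_unsafe(payload))
```

The hex digits in the payload contain nothing for the escaping step to catch, so they reach the final substitution untouched. A placeholder the user cannot predict, such as a random value generated per render, would close this particular hole.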
Wait hang on, I'm all confused. Is there a reason you can't just take an input, sanitize it, then lex+parse it like you would any other compiled thing (which Markdown is, markdown -> HTML), and spit out the HTML at the end? What's with this MD5 to escape some characters to "prevent double escaping"? I have admittedly not looked much into markdown implementations, especially for forum-like sites like Reddit.
Sanitizing it often causes its own problems, including losing things like spaces, or having a ridiculous regex that depends on knowing what the user intends on inputting. For example, are we going to sanitize for "? Well, what about “ or ❝ ? The solution you've come up with is the old programming problem:
' a programmer has a problem that they solve with regex, now they have 2 problems '
I don't necessarily agree with their solution but it can be easy to see how they came to it. 🐐 here is an unsanitized goat.
"Sanitizing it often causes its own problems including losing things like spaces, or having a ridiculous regex that depends on knowing what the user intends on inputting. For example are we going to sanitize for "? Well what about “ or ❝ ?"
Just the bare minimum, basically anything that would come out as HTML or scripts, so you can't do <b>this</b> or <?php echo("this"); ?> or <script>alert("this");</script> So just turn "<" into "&lt;" and ">" into "&gt;" and you should be good to go? You need to make sure you can't SQL inject, too (the issue with those quotation characters, I imagine) - I obviously haven't thought this through too far and I'm sure there's a bunch of issues like that. There usually are libraries to do input sanitizing, aren't there? Then, Markdown can handle the rest as normal, which sounds like it's the harder issue with specifying a grammar and building a lexer+parser off it. Markdown would probably ignore things like “ or ❝ or 🐐 and treat them as normal characters.
"🐐 here is an unsanitized goat."
Get it together, goat. Wash your hands more often! Saw the goat on my phone but not on my desktop browser :(
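For what it's worth, the "bare minimum" idea above can be sketched in a few lines of Python using only the standard library. This is an illustration, not how Hubski or Reddit actually do it: `sanitize_for_html` and `store_comment` are made-up names, the `comments` table is invented, and in-memory SQLite stands in for whatever database the site really uses.

```python
import html
import sqlite3


def sanitize_for_html(text: str) -> str:
    """Escape &, <, > and quotes; other characters pass through unchanged."""
    return html.escape(text, quote=True)


def store_comment(conn: sqlite3.Connection, author: str, body: str) -> None:
    """Insert with bound parameters, so quotes in the input stay plain data."""
    conn.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)",
        (author, body),
    )
    conn.commit()


# Hypothetical schema, in memory, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, body TEXT)")

raw = '<script>alert("this");</script> “ ❝ 🐐'
store_comment(conn, "goat'); DROP TABLE comments;--", raw)

# The raw text is stored as-is and only escaped when it is rendered.
stored = conn.execute("SELECT body FROM comments").fetchone()[0]
print(sanitize_for_html(stored))
```

The curly quotes and the goat come out untouched, since only the handful of characters with meaning in HTML get rewritten. The SQL side needs no character filtering at all because the values never become part of the query text.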
Yeah, I too am ignorant of the reasons they did this, and I feel like your method is probably closer to the right approach. It might have something to do with computation: there might be some data showing that looking up a hash in PostgreSQL, rather than a string, is computationally cheaper than doing some text processing. Like I said, though, "I dunno!"
It was pretty aggro. You'd open up Reddit, your envelope would have a three-digit number next to it, and if you hovered over anything you saw, you joined the party. Whole site went down for about 18 hours. It was one of those examples of precocious kids going "hmm - what happens when I do this?" and what happens is AWS gets dragged down by a substantial percentage because Reddit's that much of a hog. Dude did a Q&A about three days later (this was a year or two before IAmA existed) but it's lost to time.
Some unsolicited suggestions: I highly recommend rdiff-backup for source code. Also, PLEASE guys, do not keep backups on the same server as production. I've seen too many devs lose everything because of a misplaced command. Hard drives are cheaper and easier than rewriting code! I'm glad to hear you're implementing this!
^_^
from: rob05c
to: mk
date: Fri, Jul 10, 2015 at 12:54 AM
subject: Hubski Backups
Ok, I set up my server to back up hubski's user data daily. It's incremental, so it shouldn't suck bandwidth, but if it's an issue, let me know and I'll kill it. I'm specifically running rdiff-backup
I would also like to thank everyone that expressed interest in joining us. But, it was clear that no one could match the bribe that rob05c paid. In all seriousness, we really do appreciate it. I am sure that this won't be the last team member we will need, and it's great to know that we don't have to look outside the community when the time comes. We keep track of all those who have expressed interest.
"once we have made sure that there are no vulnerabilities exposed in the release."
What mk actually means by this is that he has to go through the code and remove all the comment code that contains wise-ass remarks, swearing, complaining, and general unprofessional looking stuff. ;)
No that's forwardslash.
commit ae878fc8b9741d099a4145617e4a48cbeb390623
Author: forwardslash
Date: Fri Jun 11 01:44:02 2015 +0000
Fuck
Change-Id: I2072f93ad6c889d534b04009671147af653048e7
I once worked at a law firm where a word processor got sacked because of this. It's common practice for word processors to put placeholder phrases in long documents, to keep track of changes and so forth. Near the end, they'll globally search, replace, sanitize, then save the safe public version. In this instance, the lawyer needed the document asap for something last minute/urgent, the word processor working on the document wasn't around, so he took the document that was on the server and presented it to his client... Only at this point did the lawyer realize the document he'd handed the client was filled with "FUCK" and "SHIT" and so forth. Bam. Word processor was fired that day.
"It's got a light side, a dark side, and it holds the universe together" - Or, alternatively, "If women don't find you handsome, they can at least find you handy"
Dude, the entirety of the internet is held up with handshakes and duct tape. I will not judge anyone for using it. (I may judge someone for using it and then making it seem as if they've made gold from lead, though.) If anyone doubts this statement, please look into BGP routing. Then look at the exhaustion of IPv4 space and the near 5% worldwide adoption of IPv6. It is insane to me that the internet even works as well as it does. The fact that we have wireless connections to the network, doubly so.
I still think the biggest feature missing from tags is something like StackOverflow's "tag wiki". This is possible for personal tags, but obviously not for public tags. Perhaps, on public tags, since Hubski rejects the idea of moderators (though filtering out a community tag is a rough approximation - e.g. #spam - I'm not sure how sustainable that is though), top taggers could enter a one-line description for that tag. Speaking of tag descriptions, when you tag a new/edited post, you should have a popup with tab-completion and some of the summary. Among other things, this would minimize tpyos.
I couldn't be happier to have you on the team, rob05c. You get what we are trying to accomplish, you are assertive, you are a hubskier and, most importantly, you are very good at what you do. With you, forwardslash and mk at the programmatic helm, the future of Hubski looks good. Now, let's get us an API....