rob05c  ·  3407 days ago  ·  post: Hubski development progress update: internal api good, Arc bad, external api sooner.

    Is your dataset normalized?

It's in BCNF, if you count processed data as unique and treat null as a value. But it stores processed data that ought to be computed at runtime, for example text, md, and searchtext. Fixing that means changing the application, so it will come later.

    weed out some unnecessary/problem columns?

There are no unnecessary columns at this point. We just converted the data to SQL. I did try removing the massive searchtext table from the query. Didn't help.

    Does the Arc SQL adapter support the use of stored procedures?

No, and I will avoid them, unless absolutely necessary for performance. Code should be in the application.

Right now, it looks like it just needs bulk loading. It's not the network. When the app starts, we load ~250k publications. With 13 tables, at roughly one query per publication per table, that's about 3 million queries (250k × 13 ≈ 3.25 million). Bulk loading makes it 14 queries. Even for several hundred megabytes of data, the overhead of that many queries far outweighs the cost of transferring the data itself. I've faced this problem professionally before, and saw a 20× speedup in a similar situation.
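
To make the query-count argument concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the Arc adapter or the actual Hubski schema: the table names (attr_0 ... attr_12), the columns, and the helper functions are all made up for illustration. It counts one query per table for the bulk path, so it shows 13 rather than 14 and does not model whatever the fourteenth query is.

```python
import sqlite3


def make_db(n_pubs, n_tables):
    """Build a toy database: one table per attribute, keyed by publication id.

    The attr_N table names and columns are hypothetical, not the real schema.
    """
    conn = sqlite3.connect(":memory:")
    for t in range(n_tables):
        conn.execute(
            f"CREATE TABLE attr_{t} (pub_id INTEGER PRIMARY KEY, value TEXT)"
        )
        conn.executemany(
            f"INSERT INTO attr_{t} VALUES (?, ?)",
            [(i, f"t{t}-p{i}") for i in range(n_pubs)],
        )
    return conn


def load_per_row(conn, n_pubs, n_tables):
    """Load every publication with one query per publication per table."""
    pubs = {}
    queries = 0
    for pub_id in range(n_pubs):
        pub = {}
        for t in range(n_tables):
            row = conn.execute(
                f"SELECT value FROM attr_{t} WHERE pub_id = ?", (pub_id,)
            ).fetchone()
            queries += 1
            pub[f"attr_{t}"] = row[0]
        pubs[pub_id] = pub
    return pubs, queries


def load_bulk(conn, n_tables):
    """Load every publication with one query per table, merging in memory."""
    pubs = {}
    queries = 0
    for t in range(n_tables):
        cursor = conn.execute(f"SELECT pub_id, value FROM attr_{t}")
        queries += 1
        for pub_id, value in cursor:
            pubs.setdefault(pub_id, {})[f"attr_{t}"] = value
    return pubs, queries


if __name__ == "__main__":
    n_pubs, n_tables = 1000, 13  # scaled down from ~250k publications
    conn = make_db(n_pubs, n_tables)
    per_row, q_per_row = load_per_row(conn, n_pubs, n_tables)
    bulk, q_bulk = load_bulk(conn, n_tables)
    print("per-row queries:", q_per_row)
    print("bulk queries:", q_bulk)
    print("same data:", per_row == bulk)
```

Both loaders produce the same in-memory data; only the number of round trips differs. The per-row count grows with publications × tables, while the bulk count stays at the number of tables however many publications there are.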

It also took me 3 months to do a large OOP system. This will not take 3 days. LISP ≫ C++.