#clojure logs

2008-04-13

00:01drewrGolly, if I want to get the mtime of a file, Google tells me I need to use the Tomcat FileInfo class. Is there a better way?
00:02drewrSurely there's something in java.io.*.
00:06abrooksdrewr: Java boldly refuses to acknowledge that there is an underlying platform. What you're looking for may be there but I suspect not.
00:08jonathan__hmmm, see this --> http://www.bmsi.com/java/posix/docs/posix.File.html
00:11jonathan__but it looks like java.io.File can get you the last modified --> http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html
00:11abrooksposix.* is not part of the Java distribution from anyone. :(
00:19drewrHeh, lastModified()... ugh.
00:19jonathan__=== mtime ?
00:20drewrWhat about the other things you might need to know? inode, symlink, etc.?
00:22abrooksJava is its own platform. It's not a good platform for building system tools without third-party classes (JNI based).
00:23abrookshttp://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4042001 (symlink support)
00:23jonathan__yeah, I work in "enterprise" software, and we'd typically never need stuff like that ... sadly we use C++, if only we could use Java
00:23abrooksThere are lots of RFIs for platform support.
00:24drewrThis philosophy never made sense to me. So many problems have been solved by operating systems that you shouldn't have to re-solve. :-)
00:24jonathan__RFI?
00:24abrooksThe GNU Classpath project is extending some base classes. It would be nice if they'd support posix-y-gnu-ish interfaces.
00:25abrooksjonathan__: RFE, sorry. Request For Enhancement.
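The java.io.File route mentioned above can be sketched roughly as follows. This is a minimal illustration, not anything posted in the channel: the class name MtimeSketch and the helper mtimeMillis are hypothetical, and a temporary file stands in for a real path. It covers only the modification time; the inode and symlink details asked about are not exposed by java.io.File.

```java
import java.io.File;
import java.io.IOException;
import java.util.Date;

// Sketch: reading a file's modification time with plain java.io.File.
// MtimeSketch and mtimeMillis are hypothetical names used for illustration.
class MtimeSketch {

    // Returns the mtime in milliseconds since the Unix epoch.
    // File.lastModified() returns 0L if the file does not exist
    // or an I/O error occurs, so 0L is ambiguous by design.
    static long mtimeMillis(File f) {
        return f.lastModified();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for whatever path you care about.
        File tmp = File.createTempFile("mtime-sketch", ".txt");
        tmp.deleteOnExit();
        System.out.println(tmp + " last modified: " + new Date(mtimeMillis(tmp)));
    }
}
```

From Clojure the same call would go through Java interop on a java.io.File instance; the sketch stays in Java to keep the API visible.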
00:29drewrI'm looking at Clojure for migrating some data concurrently between SQL Server and Postgres with JDBC. *That* should be well-supported.
00:29abrooksThat would be Java's domain. :)
00:29jonathan__Ok, I don't know about pg, but the jtds 1.2 driver works like a champ with sql server
00:30jonathan__and the pure java Oracle thin drivers rock also
00:30drewrjonathan__: Awesome, thanks.
00:31drewrI've used the thin driver for ORA before.
00:31drewrIt did work well.
00:31jonathan__I tried and tried but *strangely*, the MS driver for SQL Server completely failed to connect
00:31jonathan__</sarcasm>
00:31drewrHey, of course, http://jdbc.postgresql.org/.
00:34drewrWonder what the best way of approaching this would be. Have agents bite off a chunk of rows and each work independently?
00:34jonathan__What are you trying to do?
00:35drewrWe've got massive amounts of data that comes off our telecom platform, which only talks SQL Server.
00:36drewrIn order to manipulate it and report on it, we bring it over to PG.
00:36drewrThe process for doing that is extremely slow.
00:37drewrI think that doing it concurrently will speed things up.
00:39jonathan__What's the fastest that pg will slurp in data? Can you generate a bulk insert file? Or are you using other methods?
00:39jonathan__(assuming pg supports stuff like that)
00:40drewrI've only tried DTS with SQL Server so far.
00:40drewrIt's dog-slow.
00:40drewrLiterally days to get a single dump.
00:41drewrThat's why I'm going to write something that's more efficient, but if I do it sequentially I'm afraid I'll have the same problem.
00:41drewr...doing 100 or 1000 rows at a time.
00:44jonathan__so you use DTS to generate data to a text file?
00:44drewrSo my naïve idea is to have a pointer to the current row that gets updated in a Clojure transaction every time an agent grabs his dataset.
00:44drewrjonathan__: No, it moves it straight into PG.
00:44drewrs/moves/copies/
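The "pointer to the current row" idea can be sketched as below. Note the swap: drewr describes a Clojure transaction with agents, whereas this sketch uses a java.util.concurrent AtomicLong and a fixed thread pool to show the same claim-a-chunk pattern. All names (ChunkedCopySketch, claimChunk, copyAll) are hypothetical, the actual SELECT/INSERT work is a placeholder comment, and it assumes rows can be addressed by a contiguous offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: workers claim disjoint row ranges by advancing a shared cursor.
// All names here are hypothetical; no database access is performed.
class ChunkedCopySketch {

    // Atomically claims the next chunk.
    // Returns {start, endExclusive}, or null once every row has been claimed.
    static long[] claimChunk(AtomicLong cursor, long totalRows, long chunkSize) {
        long start = cursor.getAndAdd(chunkSize);
        if (start >= totalRows) {
            return null;
        }
        return new long[] { start, Math.min(start + chunkSize, totalRows) };
    }

    // Runs the given number of workers until all rows are claimed.
    // Returns the total number of rows the workers handled.
    static long copyAll(long totalRows, long chunkSize, int workers) throws Exception {
        AtomicLong cursor = new AtomicLong(0);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                results.add(pool.submit(() -> {
                    long handled = 0;
                    long[] range;
                    while ((range = claimChunk(cursor, totalRows, chunkSize)) != null) {
                        // Placeholder: a real worker would read rows
                        // [range[0], range[1]) from the source database
                        // and write them to the target here.
                        handled += range[1] - range[0];
                    }
                    return handled;
                }));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("rows handled: " + copyAll(10_500, 1_000, 4));
    }
}
```

Because each claim is a single atomic increment, no two workers can receive overlapping ranges, and the final chunk is clipped to the row count. Whether this speeds up a real migration depends on where the time actually goes, which the sketch does not measure.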
00:50jonathan__Sounds like the overhead of using DTS/ODBC(?) may be the problem, rather than being sequential ... but obviously I could be totally wrong
00:50jonathan__heh
00:52drewrTrue, it could be. I need to profile it better to see where the bottleneck is.
00:56jonathan__Assuming round-tripping is the problem, I'd be looking to try and generate something that could be read by the copy command ... http://www.commandprompt.com/community/pgdocs8/sql-copy
00:59jonathan__Hopefully SQL server should be able to spit out CSV files at 10s of k rows a sec
01:00jonathan__versus 200 rows a sec which sounds like what you may be seeing
01:00drewrThat's probably the ballpark
01:00drewrI don't really want to generate intermediate data, but I may have to.
01:02jonathan__yeah, escaping text data can be a pain etc ...
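The escaping concern for an intermediate file can be illustrated with a small sketch. It assumes the file will be loaded with PostgreSQL's COPY in CSV mode using its defaults (comma delimiter, double-quote quoting, embedded quotes doubled, unquoted empty field read as NULL). The names CsvEscapeSketch, escapeField and toCsvLine are hypothetical, and the sketch does not try to cover every edge case of the format.

```java
// Sketch: quoting fields for a CSV file intended for PostgreSQL COPY ... CSV
// with default settings. All names are hypothetical.
class CsvEscapeSketch {

    // Escapes one field. A Java null becomes an unquoted empty field,
    // which COPY's CSV mode reads as NULL by default; an empty string
    // is written as "" so it stays distinguishable from NULL.
    static String escapeField(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.isEmpty()
                || value.indexOf(',') >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Double any embedded quote characters, then wrap the field in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins already-stringified column values into one CSV line (no trailing newline).
    static String toCsvLine(String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escapeField(fields[i]));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCsvLine("42", "Smith, Jane", "said \"hi\"", null, ""));
    }
}
```

Keeping NULL and the empty string distinct matters when the target columns are nullable text; if the source data never contains NULLs, the first branch is simply never taken.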
01:03jonathan__which reminds me, does emit escape data yet ... *my* version does :)
01:16ChouserExtremely primitive log of this channel for the past couple months: http://n01se.net/chouser/clojure-log/
01:17ChouserLet me know if you see any data errors. The format obviously needs improvement.
01:17drewrChouser: Cool, thanks.
01:17drewrI'm off to bed. Thanks for the brainstorming guys.
01:18jonathan__cool, should there be a notice that the channel is archived? or is that pretty common for irc?
01:19Chouserjonathan__: I dunno. To suggest that anything said here is private seems a bit of a stretch though.
01:20Chouserit's not automatically updated yet. Hopefully I can add that tomorrow.
01:20Chouserrhickey already mentioned he liked the idea. I guess if people have objections I can take the pages back down.
01:21ChouserPast my bedtime. Later!
01:21jonathan__cheers
20:14Chouserhttp://n01se.net/chouser/clojure-log/2008-04-13.html
20:15Chouserthat's the IRC log for the last couple months.
20:15rhickeycool
20:15ChouserI think it'd be most useful if we can get Google to index it.
20:16Chouserrhickey: are you interested in hosting it at clojure.org, or should I let google have at it on my own domain?
20:17rhickeyclojure.org maps to sf right now
20:17Chouserok, that's fine. It's just html and js files, no cgis or servlets or anything.
20:17Chouseror n01se.net/clojure_log is fine with me too, just thought I'd ask.
20:17rhickeyI'd have to get some automated way to upload it regularly
20:18Chouseryeah, rsync over ssh would be preferred (that's how I'm getting it onto n01se), but ftp or whatever is fine too.
20:20rhickeyLet me think about it - still catching up, was away this weekend
20:25Chousernp
20:25Chouserand no rush either