2008-04-13
| 00:01 | drewr | Golly, if I want to get the mtime of a file, Google tells me I need to use the Tomcat FileInfo class. Is there a better way? |
| 00:02 | drewr | Surely there's something in java.io.*. |
| 00:06 | abrooks | drewr: Java boldly refuses to acknowledge that there is an underlying platform. What you're looking for may be there but I suspect not. |
| 00:08 | jonathan__ | hmmm, see this --> http://www.bmsi.com/java/posix/docs/posix.File.html |
| 00:11 | jonathan__ | but it looks like java.io.File can get you the last modified --> http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html |
| 00:11 | abrooks | posix.* is not part of the Java distribution from anyone. :( |
| 00:19 | drewr | Heh, lastModified()... ugh. |
| 00:19 | jonathan__ | === mtime ? |
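
As a minimal sketch of the `java.io.File.lastModified()` approach mentioned above: the class and method names here (`MtimeExample`, `mtimeMillis`) are hypothetical and chosen only for illustration. `lastModified()` returns milliseconds since the epoch, and returns `0L` when the file does not exist or an I/O error occurs.

```java
import java.io.File;
import java.io.IOException;
import java.util.Date;

public class MtimeExample {
    // Hypothetical helper: thin wrapper around java.io.File.lastModified().
    // Returns milliseconds since the epoch, or 0L if the file does not
    // exist or an I/O error occurs.
    static long mtimeMillis(File f) {
        return f.lastModified();
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway file so the example has something to inspect.
        File tmp = File.createTempFile("mtime-demo", ".txt");
        tmp.deleteOnExit();
        System.out.println("mtime: " + new Date(mtimeMillis(tmp)));
    }
}
```

This covers only the modification time; it does not address the inode or symlink questions raised next.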
| 00:20 | drewr | What about the other things you might need to know? inode, symlink, etc.? |
| 00:22 | abrooks | Java is its own platform. It's not a good platform for building system tools without third-party classes (JNI based). |
| 00:23 | abrooks | http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4042001 (symlink support) |
| 00:23 | jonathan__ | yeah, I work in "enterprise" software, and we'd typically never need stuff like that ... sadly we use C++, if only we could use Java |
| 00:23 | abrooks | There are lots of RFIs for platform support. |
| 00:24 | drewr | This philosophy never made sense to me. So many problems have been solved by operating systems that you shouldn't have to re-solve. :-) |
| 00:24 | jonathan__ | RFI? |
| 00:24 | abrooks | The GNU Classpath project is extending some base classes. It would be nice if they'd support posix-y-gnu-ish interfaces. |
| 00:25 | abrooks | jonathan__: RFE, sorry. Request For Enhancement. |
| 00:29 | drewr | I'm looking at Clojure for migrating some data concurrently between SQL Server and Postgres with JDBC. *That* should be well-supported. |
| 00:29 | abrooks | That would be Java's domain. :) |
| 00:29 | jonathan__ | Ok, I don't know about pg, but the jtds 1.2 driver works like a champ with sql server |
| 00:30 | jonathan__ | and the pure java Oracle thin drivers rock also |
| 00:30 | drewr | jonathan__: Awesome, thanks. |
| 00:31 | drewr | I've used the thin driver for ORA before. |
| 00:31 | drewr | It did work well. |
| 00:31 | jonathan__ | I tried and tried but *strangely*, the MS driver for SQL Server completely failed to connect |
| 00:31 | jonathan__ | </sarcasm> |
| 00:31 | drewr | Hey, of course, http://jdbc.postgresql.org/. |
| 00:34 | drewr | Wonder what the best way of approaching this would be. Have agents bite off a chunk of rows and each work independently? |
| 00:34 | jonathan__ | What are you trying to do? |
| 00:35 | drewr | We've got massive amounts of data that comes off our telecom platform, which only talks SQL Server. |
| 00:36 | drewr | In order to manipulate it and report on it, we bring it over to PG. |
| 00:36 | drewr | The process for doing that is extremely slow. |
| 00:37 | drewr | I think that doing it concurrently will speed things up. |
| 00:39 | jonathan__ | What's the fastest that pg will slurp in data? Can you generate a bulk insert file? Or are you using other methods? |
| 00:39 | jonathan__ | (assuming pg supports stuff like that) |
| 00:40 | drewr | I've only tried DTS with SQL Server so far. |
| 00:40 | drewr | It's dog-slow. |
| 00:40 | drewr | Literally days to get a single dump. |
| 00:41 | drewr | That's why I'm going to write something that's more efficient, but if I do it sequentially I'm afraid I'll have the same problem. |
| 00:41 | drewr | ...doing 100 or 1000 rows at a time. |
| 00:44 | jonathan__ | so you use DTS to generate data to a text file? |
| 00:44 | drewr | So my naïve idea is to have a pointer to the current row that gets updated in a Clojure transaction every time an agent grabs his dataset. |
| 00:44 | drewr | jonathan__: No, it moves it straight into PG. |
| 00:44 | drewr | s/moves/copies/ |
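
The shared "current row" pointer idea above can be sketched roughly as follows. This is an assumption-laden illustration, not drewr's design: it is written in Java rather than Clojure, it swaps the Clojure transaction for a `java.util.concurrent.atomic.AtomicLong`, and the names `ChunkClaimer` and `claim` are hypothetical. Each worker atomically advances the pointer by one chunk, so no two workers receive the same row range.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ChunkClaimer {
    // Shared pointer to the next unclaimed row (stand-in for the
    // "pointer to the current row" described in the log).
    private final AtomicLong nextRow = new AtomicLong(0);
    private final long totalRows;
    private final long chunkSize;

    ChunkClaimer(long totalRows, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.totalRows = totalRows;
        this.chunkSize = chunkSize;
    }

    // Atomically claims the next chunk. Returns {startInclusive, endExclusive},
    // or null once every row has been handed out.
    long[] claim() {
        long start = nextRow.getAndAdd(chunkSize);
        if (start >= totalRows) {
            return null;
        }
        return new long[] { start, Math.min(start + chunkSize, totalRows) };
    }

    public static void main(String[] args) {
        ChunkClaimer claimer = new ChunkClaimer(2500, 1000);
        long[] range;
        while ((range = claimer.claim()) != null) {
            // A real worker would SELECT this range from the source
            // database and write it to the target here.
            System.out.println("rows " + range[0] + " to " + (range[1] - 1));
        }
    }
}
```

Whether this helps depends on where the bottleneck actually is; if per-row round trips dominate, splitting the work across workers may not be the main win.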
| 00:50 | jonathan__ | Sounds like the overhead of using DTS/ODBC(?) may be the problem, rather than being sequential ... but obviously I could be totally wrong |
| 00:50 | jonathan__ | heh |
| 00:52 | drewr | True, it could be. I need to profile it better to see where the bottleneck is. |
| 00:56 | jonathan__ | Assuming round-tripping is the problem, I'd be looking to try and generate something that could be read by the copy command ... http://www.commandprompt.com/community/pgdocs8/sql-copy |
| 00:59 | jonathan__ | Hopefully SQL server should be able to spit out CSV files at 10s of k rows a sec |
| 01:00 | jonathan__ | versus 200 rows a sec which sounds like what you may be seeing |
| 01:00 | drewr | That's probably the ballpark |
| 01:00 | drewr | I don't really want to generate intermediate data, but I may have to. |
| 01:02 | jonathan__ | yeah, escaping text data can be a pain etc ... |
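
For the escaping concern, here is a minimal sketch of quoting fields for a CSV file intended for PostgreSQL's `COPY ... CSV` mode. It assumes the default settings (comma delimiter, double-quote as the quote character, NULL written as an unquoted empty field); the class and method names (`CsvEscape`, `field`, `row`) are hypothetical. It does not handle every edge case, such as a line consisting solely of `\.`.

```java
public class CsvEscape {
    // Escapes one value for CSV with comma delimiter and double-quote quoting.
    // null becomes an unquoted empty field (read as NULL by COPY CSV defaults);
    // an empty string is written as "" so it stays distinct from NULL.
    static String field(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.isEmpty()
                || value.indexOf(',') >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Embedded quotes are doubled, then the whole field is quoted.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins escaped fields into one CSV line (without a trailing newline).
    static String row(String... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(field(values[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(row("42", "Smith, John", "said \"hi\"", null));
    }
}
```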
| 01:03 | jonathan__ | which reminds me, does emit escape data yet ... *my* version does :) |
| 01:16 | Chouser | Extremely primitive log of this channel for the past couple months: http://n01se.net/chouser/clojure-log/ |
| 01:17 | Chouser | Let me know if you see any data errors. The format obviously needs improvement. |
| 01:17 | drewr | Chouser: Cool, thanks. |
| 01:17 | drewr | I'm off to bed. Thanks for the brainstorming guys. |
| 01:18 | jonathan__ | cool, should there be a notice that the channel is archived? or is that pretty common for irc? |
| 01:19 | Chouser | jonathan__: I dunno. To suggest that anything said here is private seems a bit of a stretch though. |
| 01:20 | Chouser | it's not automatically updated yet. Hopefully I can add that tomorrow. |
| 01:20 | Chouser | rhickey already mentioned he liked the idea. I guess if people have objections I can take the pages back down. |
| 01:21 | Chouser | Past my bedtime. Later! |
| 01:21 | jonathan__ | cheers |
| 20:14 | Chouser | http://n01se.net/chouser/clojure-log/2008-04-13.html |
| 20:15 | Chouser | that's the IRC log for the last couple months. |
| 20:15 | rhickey | cool |
| 20:15 | Chouser | I think it'd be most useful if we can get Google to index it. |
| 20:16 | Chouser | rhickey: are you interested in hosting it at clojure.org, or should I let google have at it on my own domain? |
| 20:17 | rhickey | clojure.org maps to sf right now |
| 20:17 | Chouser | ok, that's fine. It's just html and js files, no cgis or servlets or anything. |
| 20:17 | Chouser | or n01se.net/clojure_log is fine with me too, just thought I'd ask. |
| 20:17 | rhickey | I'd have to get some automated way to upload it regularly |
| 20:18 | Chouser | yeah, rsync over ssh would be preferred (that's how I'm getting it onto n01se), but ftp or whatever is fine too. |
| 20:20 | rhickey | Let me think about it - still catching up, was away this weekend |
| 20:25 | Chouser | np |
| 20:25 | Chouser | and no rush either |