By gzt (Wed May 08, 2013 at 10:18:24 PM EST) gzt, k-means, head leaking, classes
right after my last final yesterday, my throat developed a twinge; by evening it was hurting, and now I have a full-blown case of the sniffles. ugh.

JPEG can usually get away with 10-20 fold compression before the loss gets noticeable. I think this k-means idea, in this naive implementation, gets 7-20 fold if you use block size 3 or 5, but there are things that can be done to tweak that a little (e.g., using short ints where possible to store some things). But there's also an idea for reducing one of the big components by a third, which makes another factor of two. anyway, writing up the report on stuff and things right now.
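For a rough sense of where a figure like 7-20 fold could come from, here is a minimal sketch of block-wise k-means compression (vector quantization) in plain Python. Everything in it is an assumption for illustration, not the implementation described above: the function names (`extract_blocks`, `kmeans`, `reconstruct`, `compression_ratio`), the 8-bit grayscale image stored as a list of rows, the codebook size `k`, and the simple bit accounting (one index per block plus the codebook itself).

```python
import math
import random


def extract_blocks(img, b):
    """Split a 2-D list of pixels into non-overlapping b x b blocks (flattened).
    Rows/columns that do not fill a whole block are dropped in this sketch."""
    h, w = len(img), len(img[0])
    blocks = []
    for r in range(0, h - h % b, b):
        for c in range(0, w - w % b, b):
            blocks.append([img[r + i][c + j] for i in range(b) for j in range(b)])
    return blocks


def sqdist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; requires k <= len(points).
    Initial centers are k randomly sampled points (an arbitrary choice)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign(points, centers)


def reconstruct(centers, labels, h, w, b):
    """Rebuild the (cropped) image by pasting each block's cluster center."""
    out = [[0] * (w - w % b) for _ in range(h - h % b)]
    idx = 0
    for r in range(0, h - h % b, b):
        for c in range(0, w - w % b, b):
            block = centers[labels[idx]]
            for i in range(b):
                for j in range(b):
                    out[r + i][c + j] = round(block[i * b + j])
            idx += 1
    return out


def compression_ratio(h, w, b, k, bits_per_pixel=8):
    """Assumed accounting: ceil(log2 k) bits per block index, plus a codebook
    of k centers stored at bits_per_pixel per entry. No entropy coding."""
    n_blocks = (h // b) * (w // b)
    original = n_blocks * b * b * bits_per_pixel
    index_bits = max(1, math.ceil(math.log2(k)))
    compressed = n_blocks * index_bits + k * b * b * bits_per_pixel
    return original / compressed
```

Under these assumptions, a 512x512 image with `k = 256` comes out at roughly 8 fold for block size 3 and roughly 15 fold for block size 5, which is in the same ballpark as the range quoted above. The codebook is one of the larger fixed costs in this accounting, so storing it more compactly is one place where tweaks would show up.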

proctored a final this morning. grading tomorrow.

qualifying exam next thursday. two sections: methods and theory. methods is always a pain, since it is, essentially, just a grab bag of loosely connected stuff. theory should be fine, i just have to make sure i know the definitions and results of all the theorems. don't even need all the proofs!

starting next year, they've replaced the phd-level probability theory courses: it used to be two breakneck-pace 4-credit courses, now it's a series of three 3-credit courses, with the first course essentially being real analysis and measure theory. not much actual statistics-specific content in the first.

not sure what to take next semester. i have one open slot, was going to choose between stochastic processes and time series, but it turns out the wild theory guy who does the big data group is teaching nonparametric methods, which is way cool. or i could just work on research. choices. there's also always the temptation of the phd-level statistical computing class (mostly the theory of it), the continuation of the class i took this semester, but, glancing over the things they cover, i think i need more theory courses under my belt first - and it'd be too much to do that and the 2 phd core courses. i'll talk about it over the summer at some point with the instructor, as he's the guy i'm starting research with.

my head is leaking.

better beat JPEG by wumpus (2.00 / 0) #1 Fri May 10, 2013 at 08:43:03 AM EST
considering both JPEG2000 (wavelet) and JPEG XR (a scheme I can't begin to understand*) are said to get twice JPEG compression before loss becomes noticeable (and JPEG2000 does even better at higher levels of compression due to lack of blocks).


* XR uses two block levels (said to typically be 4x4) and a lifting scheme. I'm not sure if this computes something that is almost but not quite a DCT (one that can be reconstructed exactly) or computes a wavelet in a dumb enough fashion to not have prior art.
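To illustrate what "lifting" buys in terms of exact reconstruction, here is a minimal sketch of a generic integer lifting step (the classic average/difference pair, sometimes called the S-transform). This is not JPEG XR's actual transform; the function names `lift_forward` and `lift_inverse` are made up for the example. The point is only that each step adds or subtracts a rounded function of the other value, so the inverse can undo it with the same rounding and no loss.

```python
def lift_forward(a, b):
    """Map an integer pair to (floor-average, difference) using lifting steps."""
    d = a - b          # step 1: difference
    s = b + (d >> 1)   # step 2: b plus floor(d / 2), which equals floor((a + b) / 2)
    return s, d


def lift_inverse(s, d):
    """Undo lift_forward exactly by reversing the steps in opposite order."""
    b = s - (d >> 1)   # undo step 2 with the identical rounded term
    a = d + b          # undo step 1
    return a, b
```

Because the rounded term `d >> 1` is recomputed from `d` alone, the inverse subtracts exactly what the forward step added, so `lift_inverse(*lift_forward(a, b))` returns `(a, b)` for any integers, even though the average itself was rounded.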

JPEG by gzt (2.00 / 0) #2 Fri May 10, 2013 at 12:06:50 PM EST
yeah, hard to beat an expert group that's been working for 20+ years. the more interesting aspect here is using the clustering later for feature recognition or identification so you end up with a compact representation of what you're interested in knowing from the image rather than necessarily a compressed image. but i will have to, eventually, figure out all the JPEG stuff and can then explain XR to you.

