
SC13 – 4th SC Workshop on Petascale (Big) Data Analytics

I am spending this week attending the Supercomputing 2013 Conference.


I have my old boss, Darrell Black, to thank for this. We were trying to get a bunch of ex-Lewanee people together for lunch, and Darrell, who now works for Intel, couldn’t go because he was attending SC13. It’s good timing, as we are looking at new HPC equipment to render animations from 3DS MAX, so here I am. My goal is to find someone from NVIDIA with whom I can discuss CUDA and IRAY.

Today is all about cool science.

I sat in on the “4th SC Workshop on Petascale (Big) Data Analytics: Challenges and Opportunities.” The workshop description reads:

The recent decade has witnessed data explosion, and petabyte sized data archives are not uncommon any more. It is estimated that organizations with high end computing (HEC) infrastructures and data centers are doubling the amount of data that they are archiving every year. On the other hand computing infrastructures are becoming more heterogeneous. The first three workshops held with SC10, SC11, and SC12 were a great success. Continuing on this success, in addition to the cloud focus, we propose to broaden the topic of this workshop with an emphasis on middleware infrastructure that facilitates efficient data analytics on big data. The proposed workshop intends to bring together researchers, developers, and practitioners from academia, government, and industry to discuss new and emerging trends in high end computing platforms, programming models, middleware and software services, and outline the data mining and knowledge discovery approaches that can efficiently exploit this modern computing infrastructure.


While it was all about big data, it was really a lesson in fusion. I listened to talks about the Fusion Simulation Program and the National Spherical Torus Experiment (NSTX). I didn’t know that fusion isn’t cold. The current state of the art is a burning-hot plasma generated by combining the hydrogen isotopes deuterium and tritium under pressure and heat. They fuse together and produce helium and energy. The containment is handled by a magnetic field, with the hot gases swirling around inside. Cool stuff. Each test “shot” costs a million dollars and generates a large chunk of data, 2.5 GB if I saw the number correctly. The shot lasts 10 seconds.

They are working to build a much larger reactor, the International Thermonuclear Experimental Reactor (ITER), which will maintain ignition for 500 to 1000 seconds. The data is going to get really large and will need to be shared all over the world. Someone is going to be buying a bigger network. Something on the order of 10 Gbps, I would guess.
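Out of curiosity, here is a rough back-of-the-envelope sketch of what those numbers might mean. It assumes the 2.5 GB and 10 second figures I jotted down are right, that a longer shot would produce data at the same per-second rate (my assumption, not something the speakers said, and quite possibly an underestimate), and that a 10 Gbps link could be used at full speed. The function names are just mine for illustration.

```python
# Back-of-the-envelope numbers only. The shot size and length are what I
# noted during the talk; everything else is an assumption for illustration.
SHOT_BYTES = 2.5e9      # 2.5 GB per shot (decimal gigabytes assumed)
SHOT_SECONDS = 10.0     # length of one shot
LINK_GBPS = 10.0        # my guessed network speed, assumed fully usable


def data_rate_gbps(num_bytes, seconds):
    """Average data rate in gigabits per second."""
    return num_bytes * 8 / seconds / 1e9


def shot_size_gb(seconds, rate_gbps):
    """Gigabytes produced by a shot of this length at a constant rate."""
    return rate_gbps * seconds / 8


def transfer_seconds(gigabytes, link_gbps):
    """Seconds needed to move the data over an ideal link."""
    return gigabytes * 8 / link_gbps


rate = data_rate_gbps(SHOT_BYTES, SHOT_SECONDS)
print(f"Current shot: about {rate:.1f} Gbps while it runs")

# Assumption: a 500 to 1000 second shot keeps the same per-second rate.
for seconds in (500, 1000):
    size = shot_size_gb(seconds, rate)
    moved_in = transfer_seconds(size, LINK_GBPS)
    print(f"{seconds} s shot: about {size:.0f} GB, "
          f"about {moved_in:.0f} s to move at {LINK_GBPS:.0f} Gbps")
```

Under those assumptions a single long shot lands in the low hundreds of gigabytes, which is why sharing the data worldwide starts to look like a networking problem.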

More than just fusion.

Jack Wells, Science Director of the Oak Ridge Leadership Computing Facility (OLCF), talked about materials science and the challenges that even TITAN, the 18,000+ node Cray supercomputer at Oak Ridge National Laboratory, has keeping up with data growth.


Chad Steed gave a presentation on visualizing your data and shared a link to EDEN, a nice open source tool that I want to play with. He showed the tool being used to analyze the Twitter feed.

There was discussion of using NVRAM in place of DRAM as part of the classic memory model, which would allow a program to survive a crash by saving its state and initial vector in memory. There was also a really good discussion on moving data between facilities.

Supercomputing === Brains.

I also got to meet Forrest Hoffman. He showed the research behind the ForWarn early warning system for the US Forest Service. While that was interesting, the big thing for me is his contributions to Linux Journal on Extreme Linux and Beowulf clusters. The OSCAR cluster we built for Dr. Hasz’s Realtime Computing class was based on Forrest’s work. Thank you. I’ll bet I am the only person ever to use the scheduling tools in OSCAR to run HPJava programs on a 40-node cluster without once using MPI.

I have never been in a room with so many Ph.D.s. It was a bit intimidating. The morning lectures had me thinking I was really in the wrong place. Everyone seemed to know each other. It seemed that I was getting looked over, the “who are you?” eyeballing. By 11:00 am I realized that they all use the Oak Ridge TITAN supercomputer. That is the common thread in the room.

I am amazed at how many Apple MacBooks there were in the room. All of the presenters and over 90% of the audience had a Mac open. The comforting part was how many of the presenters had a hard time getting their laptop to work with the projector. We always seem to have that problem in our office.

Tomorrow is Python, all day Python.  I am excited.