The Gulf of Naples as seen from Princeton

The Gulf of Naples as seen from Castel dell’Ovo

I’m just back in Princeton, and I decided to share with you this picture, taken a few days ago during a walk in Naples (Napoli).

I really enjoyed my stay at home in Italy: having dinner with my family, visiting some relatives, wandering around some amazing places, and sharing a few beers with friends of a lifetime or just met!

But, please, don’t get me wrong: I’m happy to be back to my usual life of skipping dinner alone, “making the science”, playing volleyball with my colleagues, training MMA, and sharing a few beers with friends of a lifetime or just met!

Happy birthday to me!

NECLA’s Annual Volleyball Tournament 2013: the Champions!

The Computing Systems Architecture team, winner of the tournament

COSMIC: middleware for high performance and reliable multiprocessing on Xeon Phi coprocessors

In Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC ’13). ACM, New York, NY, USA, 215-226.

It is remarkably easy to offload processing to Intel’s newest manycore coprocessor, the Xeon Phi: it supports a popular ISA (x86-based), a popular OS (Linux) and a popular programming model (OpenMP). Easy portability is attracting programmer efforts to achieve high performance for many applications. But Linux makes it easy for different users to share the Xeon Phi coprocessor, and multiprocessing inefficiencies can easily offset gains made by individual programmers. Our experiments on a production, high-performance Xeon server with multiple Xeon Phi coprocessors show that coprocessor multiprocessing not only slows down the processes but also introduces unreliability (some processes crash unexpectedly).

We propose a new, user-level middleware called COSMIC that improves performance and reliability of multiprocessing on coprocessors like the Xeon Phi. COSMIC seamlessly fits in the existing Xeon Phi software stack and is transparent to programmers. It manages Xeon Phi processes that execute parallel regions offloaded to the coprocessors. Offloads typically have programmer-driven performance directives like thread and affinity requirements. COSMIC does fair scheduling of both processes and offloads, and takes into account conflicting requirements of offloads belonging to different processes. By doing so, it has two benefits. First, it improves multiprocessing performance by preventing thread and memory oversubscription, by avoiding inter-offload interference and by reducing load imbalance on coprocessors and cores. Second, it increases multiprocessing reliability by exploiting programmer-specified per-process coprocessor memory requirements to completely avoid memory oversubscription and crashes. Our experiments on several representative Xeon Phi workloads show that, in a multiprocessing environment, COSMIC improves average core utilization by up to 3 times, reduces make-span by up to 52%, reduces average process latency (turn-around-time) by 70%, and completely eliminates process crashes.

Continue reading the complete paper …

CRUX PPC 3.0 released!

CRUX PPC 3.0 is now available. The toolchain ships with Graphite support (PPL backend) and with LTO (Link Time Optimization).
CRUX PPC 3.0 is released as two different archives: 32-bit and 64-bit. The 32-bit version is based on a single-lib toolchain, whereas the 64-bit one comes with a multilib toolchain. The two versions share the same ports tree.

Download and more information on the official website: http://cruxppc.org

 

Adding a method for computing Cartesian Product to Groovy’s Collection(s)

These days I’m using the Groovy programming language very often, and I find it very intuitive and expressive. When it is appropriate and convenient, I try to use a functional programming style and functional methods.

One of the key elements of the functional programming paradigm (as opposed to the imperative paradigm) is “thinking in space rather than thinking in time”. This translates into extensive use of collections and of constructs for creating a collection from existing collections. The most commonly used collection is the list, and the syntactic construct for creating a list from existing lists is called a list comprehension.

I think that list (or, more generally, collection) comprehension in Groovy is very powerful (see the Groovy Collection API), and in my everyday usage I have found that it has everything I need to express the algorithms I implement in terms of collection comprehension. However, more than once I needed to obtain the Cartesian product of two collections, so I thought it would be nice to have a method in Collection for computing it.

Cartesian product

The Cartesian product is a mathematical operation which returns a set (or product set) from multiple sets. That is, for sets A and B, the Cartesian product A × B is the set of all ordered pairs (a,b) where a ∈ A and b ∈ B:

A \times B = \bigcup_{a\in A}\bigcup_{b\in B} \{(a, b)\}.
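The definition above can be sketched in Groovy as a category method. This is only a minimal illustration, not necessarily the implementation from the full post: the names `CartesianCategory` and `cartesianProduct` are hypothetical, and pairs are represented as two-element lists.

```groovy
// Hypothetical category class: the names are assumptions made for this sketch.
class CartesianCategory {
    // For every a in self and every b in other, emit the ordered pair [a, b].
    static List cartesianProduct(Collection self, Collection other) {
        self.collectMany { a -> other.collect { b -> [a, b] } }
    }
}

// Inside a use block the static method is callable as if it belonged to Collection.
use(CartesianCategory) {
    def pairs = [1, 2].cartesianProduct(['x', 'y'])
    assert pairs == [[1, 'x'], [1, 'y'], [2, 'x'], [2, 'y']]
}
```

A category keeps the extension scoped to the `use` block, so it does not change the behaviour of collections elsewhere in the program.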

Read more of this post

Learning Functional Programming: a K-Means implementation in Groovy

A few days ago I started studying the functional programming paradigm, so I decided to practice by implementing a commonly used algorithm. Although Groovy isn’t strictly a functional programming language, it has all the characteristics of one.

K-Means

K-Means is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
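As a rough illustration of that definition, here is a small functional-style sketch in Groovy. It is not the implementation from the full post: the method names (`squaredDistance`, `centroidOf`, `kmeans`), the representation of points as lists of numbers, and the choice to pass the initial centroids explicitly are all assumptions made for this example.

```groovy
// Squared Euclidean distance between two points of equal dimension.
def squaredDistance(List p, List q) {
    [p, q].transpose().sum { pair -> (pair[0] - pair[1]) * (pair[0] - pair[1]) }
}

// Coordinate-wise mean of a non-empty list of points.
def centroidOf(List points) {
    points.transpose().collect { coords -> coords.sum() / coords.size() }
}

// Hypothetical signature: initial centroids are supplied by the caller.
def kmeans(List points, List centroids, int maxIterations = 100) {
    // Assignment step: group each point under the index of its nearest centroid.
    def clusters = points.groupBy { p ->
        (0..<centroids.size()).min { i -> squaredDistance(p, centroids[i]) }
    }
    // Update step: move each centroid to the mean of its cluster;
    // a centroid with no assigned points stays where it is.
    def updated = (0..<centroids.size()).collect { i ->
        clusters[i] ? centroidOf(clusters[i]) : centroids[i]
    }
    // Stop when the centroids no longer move or the iteration budget is spent.
    (updated == centroids || maxIterations <= 1) ?
        [centroids: updated, clusters: clusters] :
        kmeans(points, updated, maxIterations - 1)
}

def points = [[1.0, 1.0], [1.0, 3.0], [9.0, 9.0], [9.0, 11.0]]
def result = kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
assert result.centroids == [[1.0, 2.0], [9.0, 10.0]]
```

The recursion replaces the usual mutable loop: each call computes new centroids from the old ones without modifying any state, which fits the "thinking in space" style of collection transformations.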

Read more of this post