AMUSE Workshop Oct 2009

= Monday 2009/10/05 =

Introduction to AMUSE
Simon, 9:40 AM.

Welcome to Leiden; the group moved here from Amsterdam half a year ago.

AMUSE = Astrophysics Multipurpose Software Environment

Funny meeting; sort of a MODEST meeting (dense stellar systems), but it has expanded into "modeling the Universe"-- a less modest goal. Dense star clusters = gravitational dynamics. Now we do a lot more.

Image of Tarantula Nebula. At the core is a very dense star cluster, R136, 2000 stars. The whole system includes all the gas around it. Gas and cluster influence each other.

Image : cosmic finger of friendship. "What Nature is trying to say."

In the last 400 years, telescopes (Galileo to VLT) have improved by a factor of 10$6$-- become that much larger. For the last 40 years, computers have become smaller... although the truth is that the most powerful computers are the same size (lots of cabinets), but they're way more powerful: factor of 10$9$ faster.

Image: Moore's law in supercomputers of the last decade or so. Only 10% is academic.

Images: compare the Springel et al. simulation to SDSS. These simulations are typically only done with gravity (cold dark matter). Cosmologists today, though, are getting interested in star clusters; they have now gotten to the point in their...

Calculating Galaxy (just gravity) : 10$11$ stars (squared) * 10$11$ years / 2.5x10$8$ years * 100 steps * 60 operations = number of operations = a lot.

(10$11$ years = age of univ, 2.5x10$8$ years = time of orbit, 100 steps per orbit, 60 operations per step).
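Working through those numbers as noted (a rough order-of-magnitude check, using the figures exactly as written above):

\[
\underbrace{(10^{11})^2}_{\text{star pairs}}
\times \underbrace{\frac{10^{11}\,\mathrm{yr}}{2.5\times 10^{8}\,\mathrm{yr}}}_{\approx 400\ \text{orbits}}
\times 100\ \tfrac{\text{steps}}{\text{orbit}}
\times 60\ \tfrac{\text{ops}}{\text{step}}
\approx 2\times 10^{28}\ \text{operations.}
\]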

Arches cluster: discovered in the radio; only realized later, with an IR image, that there was a cluster of 10,000 stars. Interaction between gas, stars, stellar evolution. All of this is important.

Image: artist's impression of a white dwarf accreting from a red giant star. We're probably 10 years away from simulating this with the binary system, the magnetic fields on the gas, the accretion, etc. Not just parameterized, but really simulating the physics.

GRAPE-4 on to GRAPE-6, on to NVIDIA Tesla. GPU physics.

Second Life, MICA.

Intercontinental distributed GRAPE-6 GRID: http://modesta.science.uva.nl. Honestly, it never really worked. Gravity doesn't parallelize very well (direct N-body).... Latency in the link between Amsterdam and Tokyo is about 1/3 of a second.

Image: Orion Nebula on lots of scales. Gas falling, gas shocking, gas forming stars, star forming region, star clusters, binaries, stars evolving, stars expelling gas, etc. Everything is happening here. This is a tiny part of the galaxy.

Modelling all of this: AMUSE.

Flow control layer (scripting language)

Interface layer (scripting and high-level languages)

Module layer, one column per domain, each with several options:
 * Gas dynamics -- e.g. SPH
 * Radiative transport -- e.g. Metropolis-Hastings Monte Carlo
 * Stellar evolution -- e.g. Henyey multi-shell stellar evolution
 * Stellar dynamics -- e.g. 4th order Hermite block-timestep N-body

Modular, so you can plug and unplug different algorithms, modules, and codes at the lowest layer.

Scripting language is Python; the interfaces are done using MPI.

Part of the science is designing all the algorithms. Part of the science is done at the flow control layer. (How often do we have to invoke stellar evolution to come up with a realistic answer?) Ideally, students and astronomers and anybody can write a simple script (in python) at the flow control layer.
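As a rough illustration of what such a flow-control script does, here is a self-contained toy (not the actual AMUSE API): the two "modules" below are trivial stand-ins for a gravity code and a stellar evolution code.

    import numpy

    class ToyGravity(object):
        """Stand-in for a gravitational dynamics module (softened direct summation, G = 1)."""
        def __init__(self, pos, vel, mass):
            self.pos, self.vel, self.mass = pos, vel, mass

        def _acceleration(self):
            acc = numpy.zeros_like(self.pos)
            for i in range(len(self.mass)):
                d = self.pos - self.pos[i]
                r2 = (d ** 2).sum(axis=1) + 1e-4        # softening; also avoids 0 for i == i
                acc[i] = (self.mass[:, None] * d / r2[:, None] ** 1.5).sum(axis=0)
            return acc

        def evolve_model(self, dt, nsteps=10):
            h = dt / nsteps
            for _ in range(nsteps):                     # leapfrog steps
                self.vel += 0.5 * h * self._acceleration()
                self.pos += h * self.vel
                self.vel += 0.5 * h * self._acceleration()

    class ToyStellarEvolution(object):
        """Stand-in for a stellar evolution module: simple exponential mass loss."""
        def evolve_model(self, mass, dt):
            return mass * numpy.exp(-0.01 * dt)

    # The flow control layer: how often to call each module is a science decision.
    n = 10
    gravity = ToyGravity(numpy.random.uniform(-1, 1, (n, 3)), numpy.zeros((n, 3)), numpy.ones(n))
    stellar = ToyStellarEvolution()
    for step in range(100):
        gravity.evolve_model(dt=0.1)                               # advance the dynamics
        gravity.mass = stellar.evolve_model(gravity.mass, dt=0.1)  # feed evolved masses back

In the real framework the two toy classes would be replaced by interfaces to legacy codes running in their own processes.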

Why are we here if there are programmers hired in Leiden to write this stuff? International team effort. Open source endeavor. Everybody wants to use it, we want to build community to do it.

Introduction to the AMUSE software
Monday 10:30, Arjen van Elteren

Goal: Physics: To create an instrument for performing astrophysical experiments spanning multiple physical domains and a wide range in temporal and spatial scales.

Software: combining existing, proven codes into a loosely coupled but consistent software environment, without changing the legacy codes (much)!

Roadmap: Design->Code->Legacy Interface->Next Level->Units->Run an application

Design overview: application domains starting with: gravity, stellar evolution, hydrodynamics, radiative transfer. Need to deal with input and output of data, as each application does it in its own format. Want to input/output from all of them, so you can also use AMUSE just for that. Support functions in the framework include unit handling, data conversion, initial conditions, and comparing models.

Design blocks
 * Module = unit conversion + interface <-> legacy code
 * Data model: input/output; Tree, Grid, Set

Design levels -- like the one Simon showed:
 * Data model, flow control (Python)
 * Interfaces, units, I/O (MPI, HDF)
 * Legacy codes

HDF is an established data format.

CODE : Most code is in Python. Legacy codes are in C/C++, FORTRAN. Interface between Python and Legacy code via MPI. Organized in a directory structure. (Python organizes its packages in that sort of structure.)

Code lives in an svn repository.

Directory Structure

 * trunk
 * src : source code of environment and included legacy modules
 * test : unittesting code and examples
 * support+bin : building, etc
 * doc

SRC directory: everything is in one master package, amuse. Subpackages = legacy, support, domain. legacy is not divided into physical domains; it holds all of the legacy codes, each code a sub-package. support is for unit handling, the data model, and I/O. domain has domain-specific definitions of interfaces, attributes, and parameters.

TEST directory: examples using the amuse framework. legacy tests, unittests for legacy codes; tests of the MPI interfaces and very simple problems (sun and earth for gravitational dynamics). core_tests, tests of the support package.

Legacy Interface
Based on function/subroutine calls. Basic arguments; no structures, only numbers. Function calls are *remote*, from the Python application to the legacy code application. Function calls go over MPI; MPI is used to transfer data from Python to the legacy code.

Python interface is defined as a python class, some special syntax needed to specify the remote functions (python decorator @legacy_function).

A question arises about sending huge amounts of data; it goes up to the top from one module, then back down to another module. More overhead than tunnelling, i.e. sending directly between codes. The current bottleneck isn't performance but functionality, so the current goal is to get it to run. It's harder to keep things loosely coupled with tunnelling....

EXAMPLE
Python:

    class Interface(LegacyInterface):

        @legacy_function
        def sum():
            function = RemoteFunction()
            function.add_parameter('x', 'd', function.IN)
            function.add_parameter('y', 'd', function.IN)
            function.result_type = 'd'
            return function

C code:

    double sum(double x, double y) { return x + y; }

Python script:

    x = Interface()
    result = x.sum(10.5, 31.5)
    print result

MPI: how it works
On the Python side, the interface (the methods you defined) calls generic class functions; these talk over MPI to a generated event loop on the legacy side (C, whatever), whose handling functions call into the main legacy code during the integration.

Existing parallel codes can work with this model. Communication with the legacy code is over a "special" intra-communication channel. A little bit more work with an existing MPI application, but it is possible to use it.

MPI is also used to start the legacy code (MPI 2 spawn).
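A minimal sketch of that spawning pattern with mpi4py (illustrative only; 'gravity_worker' is a hypothetical executable, not part of AMUSE):

    from mpi4py import MPI
    import numpy

    # Start the legacy code as a separate (MPI-2 spawned) process; the result is
    # an intercommunicator between the Python script and the worker.
    comm = MPI.COMM_SELF.Spawn('gravity_worker', args=[], maxprocs=1)

    # Exchange raw numbers over the intercommunicator, as the interface layer would.
    request = numpy.array([10.5, 31.5])
    comm.Send([request, MPI.DOUBLE], dest=0, tag=1)
    reply = numpy.empty(1)
    comm.Recv([reply, MPI.DOUBLE], source=0, tag=1)
    comm.Disconnect()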

TODO: events to send back information to the python script (collision partners, diagnostic values). Can send arrays; need handling to put these into legacy codes. Something else I missed.

Next level : use legacy functions to interact with:


 * parameters : code-wide specific settings, usually changed only at the start of a calculation


 * data (attributes of particles or of gridpoints), set the initial conditions, retrieve the latest calculated values.


 * evolve, let the legacy code evolve the model


 * diagnostics, get important information about the state of the code (energy, time, ...)


 * services, calculate values based on the model (e.g. gravity at a given point). Existing applications usually don't have these, but you often need them when you start combining codes.

All values in a python script have units; legacy code only knows about numbers, units are implicit. The next level combines the numbers with units, into values, or strips the units from the values to get the numbers to send to the legacy code.

Legacy codes may know about ids: next level connects ids to objects (stars, gas particles) with attributes.

Units and quantities. Default base system is SI (kg, m, s, K, A, mol, cd). Units can be composed of other units (m/s). Can be derived (parsecs).

Another base system is NBODY (G=1).

"Quantity" is a value combined with a unit. All operations work on quantities.

example:

    >>> parsec = named('parsec', 'pc', 3.08568025e16 * m)
    >>> kparsec = 1000 * parsec
    >>> print kparsec
    unit<1000 * parsec>
    >>> value1 = 10 | kparsec
    >>> print value1
    10 1000 * parsec
    >>> value2 = 1 | parsec
    >>> print value2
    1 parsec
    >>> value2 + value1
    quantity<10001.0 parsec>
    >>> (value2 + value1).in_(kparsec)
    quantity<10.001 1000 * parsec>

    >>> convert = nbody_to_si(1 | MSun, 1 | parsec)
    >>> l = 10 | length        # length was defined as an nbody unit before
    >>> l
    quantity
    >>> convert.to_si(l)
    quantity<3.08568025e+17 m>
    >>> convert.to_si(l).in_(parsec)
    quantity<10.0 parsec>

Questions and discussion
A question arises: what was the transition from MUSE? One issue was dealing with units. Input/output of data was an issue.

Simon: The idea is not to be able to model everything in the whole Universe; we want to model things where the timescales and the spatial scales are different enough that you can use separate codes to model them. If you have to deal with tightly coupled stuff -- if stellar evolution and dynamics are tightly coupled, this won't work; you really have to write a code that takes both into account. AMUSE wants to integrate separate codes that work in separate domains where the primary assumptions still hold.

Muse -> AMUSE ; political issue. Got money to make an "instrument" in an instrument sort of setting. Simon proposed to see the software environment as a sort of instrument. It was granted, but MUSE already existed as a VLT instrument. Couldn't call the project MUSE.

MUSE to AMUSE transition is professionalization.

One of the main transitions is to go to MPI instead of using SWIG and f2py. Thread safety was a problem. Couldn't instantiate the same code multiple times.

The goal is to avoid touching the existing codes as much as possible. You make a mess out of them. You have to know them. If the code is changed upstream, it's a huge amount of work to incorporate the changes again.

Would like to have documentation on the interfaces! Harder in the case of gas dynamics and radiative transfer; gotta figure out what is needed. Stellar dynamics is completely specified now, but probably not fully documented (everybody just knows it-- particle only knows position, velocity, mass). Evert will talk about stellar evolution shortly. Have external parameters, but maybe also internal parameters (stellar structure).

Ultimate goal is to have the specification and documentation clear enough so that somebody who wants to implement a new code can do so even if the core team doesn't care about implementing the interface to that new code. Simon says two interfaces; minimalist and advanced. Minimalist should be able to be implemented in a day.

ULTIMATE ULTIMATE goal : everybody who is running code can use a network service to run Evert's stellar evolution on Evert's computer transparently, and Evert has to supply CPU for the WHOLE WORLD. That's the next step after this step... networked code services.

Stellar Evolution Interfaces
11:30 AM : Evert Glebbeek on Stellar Evolution Interfaces

Evert is at McMaster, working with Steve McMillan.

He's not specifically talking about AMUSE yet. Has worked with MUSE.

Stellar evolution is calculated using a set of recipes, a set of lookup tables, or a full Henyey-evolution code. Different modules may require different initialization steps internally. E.g. some codes need opacity tables, reaction rates.

Stellar module must define a number of functions:
 * initialize_module
 * add_star(mass, metallicity)
 * evolve_star(id, time)            ; evolve to the specified time
 * get_stellar_properties(id)
 * remove_star(id)

The easiest way is to directly call the stellar evolution module. This is easy, but it is cumbersome with multiple different modules; you need to repeat lots of standard bookkeeping in user code.
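A self-contained sketch of that direct-call pattern, using the function names listed above (the module body here is a toy "recipes" implementation, not a real evolution code):

    _stars = {}

    def initialize_module():
        _stars.clear()

    def add_star(mass, metallicity):
        star_id = len(_stars) + 1
        _stars[star_id] = {'mass': mass, 'z': metallicity, 'age': 0.0}
        return star_id

    def evolve_star(star_id, time):
        _stars[star_id]['age'] = time          # a recipe code has nothing to integrate

    def get_stellar_properties(star_id):
        star = _stars[star_id]
        return {'mass': star['mass'],
                'luminosity': star['mass'] ** 3.5,   # toy main-sequence L ~ M^3.5 relation
                'age': star['age']}

    def remove_star(star_id):
        del _stars[star_id]

    # Direct usage, with all the bookkeeping in the user script:
    initialize_module()
    sun = add_star(mass=1.0, metallicity=0.02)
    evolve_star(sun, 4.6e9)
    print(get_stellar_properties(sun))
    remove_star(sun)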

Might need to use different stellar modules for different types of stars or different phases of stellar evolution (PMS stars, white dwarfs). There are also non-canonical stars (e.g. merger products, mass accretors). Different populations (different codes for different ranges of Y or Z). Also, we want to be able to compare evolution tracks. And, of course, there's always the generic benefits of modularity.

"every star knows how to evolve itsef"

    class Star(object):
        module = None      # module used to evolve this star
        public_id = 0      # ID of this star in the module

        def __init__(self, module, zams_mass, global_id):
            self.module = module
            self.public_id = global_id
            module.add_zams_star(global_id, zams_mass)

        def evolve(self, time):
            return self.module.evolve(self.public_id, time)

This makes it easier to deal with stars; you don't have to keep track of different modules. The downside is an extra layer of indirection.

Code for keeping track of all stars can be put in a class and can be re-used: the star manager. This can have code that would otherwise need to be duplicated in different stellar modules. It can also deal with moving a given star to a different module. It can have a global evolve method that evolves all stars to a given age, simplifying user code.
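A hedged sketch of such a star manager, building on the Star class above (names are illustrative; moving a star to a different module is omitted because the hand-over details are code specific):

    class StarManager(object):
        """Bookkeeping for stars spread over several stellar evolution modules."""
        def __init__(self):
            self.stars = []

        def add_star(self, module, zams_mass):
            star = Star(module, zams_mass, global_id=len(self.stars))
            self.stars.append(star)
            return star

        def evolve_all(self, time):
            # Global evolve: bring every star to the same age, whichever module it lives in.
            for star in self.stars:
                star.evolve(time)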

Some stellar evolution codes are really binary evolution codes. May want to take advantage of this. Stars and binaries should be treated the same, then. The stellar module is also a binary module. Bookkeeping can be done in a natural way by the manager.

Some binary algorithms could be coupled to any single star evolution prescription... need modularity for this.

"The data may actually be worse." --Evert Glebbeek

Extra functions needed from stellar evolution

    class Star(object):

        def gain_mass(self, mass_transfer_rate): pass
        def gain_angular_momentum(self, delta_spin): pass

        def lose_mass(self, mass_transfer_rate): pass
        def lose_angular_momentum(self, delta_spin): pass

Questions and Discussion
Want to be able to get these codes into AMUSE; currently in MUSE.

The GAIA Data Processing Framework
12:00, Mark ter Linden, Gaia Science Operations Centre, ESA, Madrid, Spain... and now for something completely different.

Mission and Hardware
GAIA home page at ESA


 * Astrometric properties of 10$9$ sources (G < 20) at microarcsecond accuracy; proper motions.
 * Photometry (G < 20)
 * Radial Velocity (G < 16)
 * Satellite mission to be launched in Spring 2012
 * 5 year survey mission in L2
 * Daily about 30 GB download from satellite

Hardware has two telescopes at a 106 degree angle. Each 'scope has 6 mirrors. Both telescopes share the same focal plane. There are 106 CCDs at the focal plane. Entirely optical.

Ground software development
Ground software to produce final catalog. 10$12$ observations; 1 msec per observation would mean 31 years.... Parallel processing necessary. Expected computing power needed: 10$21$ FLOP.

Software development has contributions from large number of institutes and observatories in Europe; funded by the community, not just by ESA. This means that the work needs to be distributed. Software must be maintained until 2020, when final catalog will be produced.

DPAC = Data Processing and Analysis Consortium. 360 people involved. Organized into 9 coordination units (CUs) and 6 data processing centers (DPCs).

ESAC (where Mark works) does mainly CU1 (System Architecture) and CU3 (Core processing). 20 persons; half are computer scientists with physics interest, half are physicists with computer science interest. Core processing is done daily, with each download from the satellite. 6 month development cycles, starting from 2006 till launch. Each 6 months, something that works must be delivered.

Must comply with the ECSS standard (ESA standards). This standard asks for a lot of documentation... have to make it even if it's not really terribly useful. Includes development plans, requirements documents, design documents, test plans and reports, interface control documents, etc.

Collaboration tools include wiki (sharing information), Mantis (issue tracking), Livelink (document repository), svn (configuration management).

Try to use "Extreme Programming". Agile techniques. Flexible compared to traditional ESA "waterfall" top-down model-- for instance, consider dealing with radiation damage on CCDs. If it turns out to be worse than anticipated, have to deal with that in software. Would be more complicated if they had to go all the way back to the requirement stage in a traditional mode.

Includes monthly planning meetings. Stories (activities) with points (1 point is half a day), limited to 10 points per story.

Coding done entirely in Java. Java is portable (likely to be using a different computer in 2020 compared to today), lots of tools available (including Eclipse development environment), performance isn't so bad any more. Currently 1 million lines of code, expected 3+ million. CU1 organizes Java workshops (1-2/year).

CU1 maintains a software library -- GaiaTools. The goal is to reduce duplication. Tools for data access, numerics, infrastructure, plotting, etc. Gaia Parameter Database, the central repository of mission parameters, numerical constants, and datasets, where it's important that everybody is using the same one. Can be used via a web browser, or directly in Java code. (Can also export to other languages.)

Data model : a single data model shared by all CUs. The input of one project is the output of another project. Online tool to maintain the model: keeps the data model consistent, allows access & modification of the model remotely. Generates source code and database schemas. The main database is at ESAC, 200 TB in the final version.

Infrastructure:


 * Whiteboard-- stores jobs, every node can access it
 * Data trains- run on the nodes, retrieve jobs from whiteboard, retrieve data from store and pass data to the takers which implement the algorithms.
 * Optimized for efficient access to data
 * RMI servers provide auxiliary data. (Try to avoid processes talking to each other as much as possible)

They're experimenting with cloud computing, e.g. Amazon EC2. Clouds can provide a large number of nodes. It's easy to port software. Can be cost effective during operations (estimate 4k euro vs. 4M euro in 2015 for AGIS). This makes sense for things like AGIS, which only runs every 6 months, after core processing of 6 months of data is done, and where the answers are wanted very quickly. (Cloud is not cost effective when you're going to be running constantly.)

= Tuesday 2009/10/06 =

Science with AMUSE
09:00 : Inti Pelupessy

Introduction
We learned a lot about what AMUSE is yesterday. Development started in May 2009....

Used for research from the beginning.


 * what is needed to make AMUSE an effective tool?
 * what are the lessons for the design of AMUSE?
 * what are the problems AMUSE is most suited for?

Inti is doing science with AMUSE as it is being built, so that the tool really will be built with its real goal in mind.

Experience with (A)MUSE
Interactive code testing with python is useful! He was able to find an issue fairly quickly with the linear interpolation he was using in one case. Interactive python is like a really powerful debugger.

Some coding will normally be necessary: helper functions, unimplemented interface functions, bookkeeping, handling parameters, initialization. (Ideally, you have to do nothing to import a new code, but of course that's not realistic.)

f2py + SWIG (ways to make your code a library in python) (used in MUSE) benefits: memory access is more direct, so it's easier to do certain things; you can define callbacks. (Callbacks let you go more directly between codes.)

f2py + SWIG drawbacks: tricky implementation, not well documented, lots of trial & error. Global data collisions (a problem for instances -- it's a pain and ugly to make multiple instances of the same code).

MPI benefits: parallel ready (this is key, and one of the main reasons); independent instances.

import of MPI codes not a problem. (At least if you're using MPICH2... this is already a solved problem, they thought about this.)

GPU issues (reinitialize every step to be safe)... often times the CUDA library can't handle multiple instances. This is a not-so-nice issue, but will hopefully improve as CUDA and the whole GPU thing gains maturity.

Codes included and/or under consideration: several listed. (Not copied here.) Grav codes, Hydro codes, Stellar evolution, Radiative Transfer.

Scientific validation: Signpost applications
"Signpost" applications:


 * tests that go beyond unittests
 * reproduce results in the literature
 * multiple physical modules (so there is really some challenge involved)
 * well defined problems; have to avoid things that use lots of data, or things that are very sensitive to the choices of input data made.
 * small (now at least!) problems; this is not a stress test. Note that a supercomputer problem from 10 years ago is a small problem today....
 * quantitative and qualitative comparison

Wish List
Hierarchical cluster formation; Bonnell et al., 2003. Hierarchical collapse of a molecular cloud and formation of a cluster. Uses gravitational dynamics, hydrodynamics, simple binaries, collisions.

Cluster ecology: Portegies Zwart et al., 1997, 1999. Evolution of a cluster with realistic stellar evolution. Uses gravitational dynamics, stellar evolution, binaries, stellar collisions.

Bridge N-body integrator (Fujii et al., 2007). Integrate a direct N-body integrator for a small system embedded in a large system where you use a tree code. Combine two N-body integrators.

Runaway collisions in dense stellar clusters (Glebbeek et al., 2009): gravitational dynamics, stellar collisions, stellar evolution.

Cluster formation with photo-ionization feedback. Requires radiative transfer. Dale et al., 2005. Compare ionization fraction, Jeans mass.

Collapse of an unstable proto-planetary disk. Mayer et al., 2002. These guys used SPH. Gravitational dynamics, hydrodynamics, simplified thermodynamic description. Compare overdensities, properties of proto-planets. (Apparently after this paper there was a lot of discussion about whether the method they used was incorrect; possibly about instabilities in SPH.)

"That's the benefit of computational things; you can invent your own physics." --Inti Pelupessy

Radiative transfer comparison tests -- radiative transfer is relatively immature. Tests in Iliev et al., 2006, 2009. Continuum transfer, simple chemical model (H, He).

Cloud-shock interaction; Agertz et al., 2007. Pure hydrodynamic test.

Disk-Planet Interaction; combines gravitational dynamics with hydro. More challenging than unstable collapse test.

All of these would be good tests to do.

Cluster dissolution in realistic galaxy environments
Problem: theoretical disruption timescales are longer than the observed disruption timescales of clusters.

Environmental effects have been studied extensively; e.g. Spitzer 1958, Ostriker et al. 1972, Baumgardt & Makino 2003, Gieles et al. 2006, 2007.

Galaxy potential, disk crossing shocks, spiral arms, GMCs. Tidal disruption potentials; along this list the level of detail increases, but the disruption timescales also get shorter.

Are the analytic estimates that form the basis of these studies reliable?

Current approach:

 1. Run full N-body/hydro models of galaxies.
 2. Determine the local tidal field tensor.
 3. Use this as input to (semi-)analytic cluster models.

This takes the guesswork out of modeling collision rates and tidal interaction strength. Realistic potentials....

Some of this was started even before AMUSE with an N-body/SPH code. Code coupled to a star cluster dissolution model. (Kruijssen et al., 2008, 2009.) What happens to clusters in a galaxy merger? There are two spikes of star formation, especially at second passage, but the number of clusters decreases, especially at second passage. (This is a preliminary result.)

Can we move beyond analytic estimates? Difficult: difference in scales between galaxy and star clusters. In the study so far, cluster evolution was done in semi-analytic formulae. The semi-analytic formulae were tested against N-body models, but in simpler configurations.

This is an ideal application for AMUSE!

1. Extract tidal history for star particles from the simulation, then subject N-body clusters to this (time dependent) tidal tensor

2. Run cluster simulations embedded in full galaxy model (bridge scheme)

Note that N-body codes don't let you specify an external tidal field or any such. However, in the bridge scheme, you kick the stars as a result of the tidal tensor every so often.
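A hedged sketch of that bridge-style kick (assuming the tidal tensor T is extracted from the galaxy run and positions are measured relative to the cluster centre; names are illustrative):

    import numpy

    def kick_from_tidal_tensor(positions, velocities, tidal_tensor, dt_kick):
        # Tidal acceleration a_i = T_ij x_j for each star, applied as an impulsive kick.
        accelerations = positions.dot(tidal_tensor.T)
        return velocities + accelerations * dt_kick

    # Inside the loop, the N-body code evolves the cluster between kicks:
    #   gravity.evolve_model(t + dt_kick)
    #   stars.velocity = kick_from_tidal_tensor(stars.position, stars.velocity,
    #                                           tensor_at_time(t), dt_kick)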

Radiation hydrodynamics with AMUSE & SimpleX
Biggest challenge with AMUSE: incorporating radiative transfer. Last thing in plan, but good to start early because it's hard.

Different kinds of radiative transfer: line transfer, continuum.

Code Inti is working on is SimpleX; Paardekooper et al., 2009. Has a high dynamic range and adaptive grid. It's computationally efficient. Diffuse recombination radiation is included.

Incorporating simplex into AMUSE:


 * mature &amp; tested code
 * need to define interface
 * based on "vertices" (particles)
 * (almost) immediate coupling with particle hydro codes?
 * C++ : one main simulation class → interface can then be based on a derived class instead of a collection of helper functions.

This is in the very early stages.

Summary

 * Developing AMUSE and applying it in our research at the same time
 * Validation of the project using "Signpost" applications
 * AMUSE applied to environmental effects in cluster evolution
 * first attempt at radiative transfer started
 * Doing science with AMUSE is fun.

("should be amusing")

On the origin of grid species: The Living Application
9:45, Derek Groen

He uses Windows, but at least he's using OOo....

The problem
He's talking about porting MUSE to the Grid, but doing it in an unusual way. He's been doing a project for the last <= 1 year.

MUSE background. Running hybrid simulations on large infrastructures is non-trivial. This is especially complicated if you want to go beyond your own cluster, and run things across different sites.

Also, each solver is optimally run on a different architecture. GRAPEs and GPUs for high-precision N-body integration. Supercomputers for cosmological simulations. Grids of PCs for stellar evolution simulators. Want to efficiently perform a simulation that switches between solvers and hardware architectures.

What is efficient?


 * Run each solver on an optimal architecture.
 * Minimal overhead
 * Short distance between solver and app. manager; don't want to waste time with the scheduler script sending information across the Atlantic
 * No redundant file transfers; if you have to transfer between countries, you want to do it directly.
 * No occupation of nodes not directly used for computing. Especially true for supercomputers and clusters that come with a reservation system.

In short, we need the application to act similar to a user, i.e. turn it into a Living Application. Application has to have some intelligence to figure out what is going on, what needs to happen next.

Living Applications
Living Applications are grid jobs that possess the privilege to act like grid users. They may independently:
 * Transfer files
 * Start new programs on the grid
 * Retrieve output data from these programs

Inti asks what is the difference between his application and a virus. Simon says that the first versions were described as being a virus by various supercomputer centers, and they didn't want to run them. Derek says that they have a clean way of turning it off....

Simple case: two sites.

Diagram: a MyProxy credential service, with the N-body solver running at site 1 and the evolution solver at site 2.

User starts this. The code reproduces itself and submits itself to other places as appropriate for what needs to be done. If you're not careful, it could go nuts reproducing itself.

User uploads credential to credential management service on MyProxy. The only thing he passes along is a mechanism for the application to acquire a credential. If application wants to move from site1 to site2, it can get the credential by contacting MyProxy. Once the credentials are acquired, it can move to site2. If you think things are going nuts, you can send a command to the proxy to kill your credential. That will stop your job from replicating further.

Derek comments that at this point the security is very similar to just launching jobs in the first place.

How to get a grid security certificate: the idea is that you don't have to type in your passwords everywhere, and you don't want to copy your ssh certificate everywhere. You fill in a form with your name, institute, passport number, etc. You give it to a regional authority. You visit this person and show your passport. Then you get your credential, and you can access the grid. If you are Japanese and visiting Sweden, you have to fly back to Japan to get your credential....  This could use some automation. Once you have your credential, you can access grid sites.

Questions re: using MPI on the grid. It's supposed to work, but in practice people have not been able to get this to work. The particle physics grid is optimized for huge numbers of small single-node jobs, as that's what those guys do.

Back to the main flow....

Living applications can terminate themselves in one location, continue in another using a different solver (Migration). They can also duplicate themselves (Cloning).

Testing the Living Application
Applied the concept to a multi-scale simulation of a galaxy merger.

Two N-body integrators; GPU in Amsterdam, low accuracy; high accuracy on GRAPE-6s at Drexel. Two galaxies with N = 2^10-2^16, plus one black hole. Near head-on collision. Using direct and tree algorithms; switches whenever d$bh$ crosses a certain value. Use the GPU and tree code until the two BHs are close together; switch to direct integration because the two clusters are more intermingled. When the BHs separate enough, return to the tree code.
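The switching rule boils down to something like this sketch (illustrative only; the threshold value 0.3 is the one quoted for the first profiling test below):

    def choose_solver(d_bh, d_switch=0.3):
        """Pick the integrator from the black-hole separation d_bh (model units)."""
        if d_bh < d_switch:
            return 'direct'   # high accuracy: phiGRAPE on the GRAPE-6 at Drexel
        return 'tree'         # low accuracy: octree code on the GPU in Amsterdam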

Two solvers; octree and phiGRAPE. Used MUSE (this is pre-AMUSE). Used Globus grid middleware + PyGlobus. Used MyProxy server. The living simulation python script. A GPU node at UvA. A GRAPE node at Drexel. (Two machines for test setup.)

Performed two profiling tests. First test was scalability over N. Used a fixed d$bh$ of 0.3, where 3 switches occur. Second test was fixed (2$15$) particles, vary d$bh$.

First, shows an example; direct takes 4533s, tree takes 271s, switching takes 2446s. In the later part, the tree code tends to overshoot. The direct and switching codes do show decreasing distance and eventual convergence.

Performance results: time vs. N. Overhead and data transfer is always less than the time spent on direct integrations; for larger N, it also becomes less than the time spent on the tree. (Logarithmic plot.) Note that this was proof of concept, did no optimization of data transfer. Transfer was done over regular Internet. All of this was with a total of 3 switches.

The steepest overhead WRT N is local I/O.... Turned out to be an oops, inefficient writing.

TEST 2 : with d$bh$ of sqrt(0.01), overheads dominate: 29 switches. For sqrt(0.1) (7 switches) and higher, integration dominates.

Living simulation: the next generation
Want to simulate a young star cluster with AMUSE. 100000 stars, two living simulators that can do evolution and N-body at the same time as necessary.

Gravity living simulations, on GPUs with Sapporo.

Living stellar evolution simulation, on a grid of PCs with EVTwin.

Simplified collision resolution (sticky spheres). Collisions in the dynamics can affect the evolution, of course; that's what makes this interesting, as otherwise you could just run the stellar evolution first, and then use that to influence the dynamics.

Conclusions
Living applications allow complex and efficient workflows across multiple clusters.

Migration time for the test case was 7-120 s, which is sufficient for coarse-grained interactions. Could probably be optimized.

Future work: Create next generation. Create generic toolbox for using living applications.

Questions
Inti asks about overheads with larger N. 10 years from now computers will be faster, so computation time goes down. Derek responds that the bandwidth of international networks is also going up; in fact in recent years, faster than computational power. Latency is the one thing that doesn't go down.

Star Formation under non-Milky way conditions with Flash
Seyit Hocuk, 10:45

Flash is a numeric code, not Adobe's Flash

The Flash Code
Adaptive Mesh Refinement simulation code, grid based. Designed for compressible flow problems. Can solve a broad range of astrophysics problems. Can do MHD, particle simulations. (It's an N-body code in addition to a hydrodynamics code.) Can do ionization, explosions.

Has a modular design; can create modules with special physics, can add them. Portable: runs on many massive parallel systems (can run on thousands of CPUs). Capable of handling extreme resolution. Then the problem becomes more visualization than simulation; already he has more problems visualizing them than running them.

Scales very well (up to 40,000 CPUs has been performed). Extendable.

Flash was started in 1997; was interested in thermonuclear flashes on the surface of compact stars (neutron stars and white dwarfs), and type Ia supernovae. (Also to simulate nuclear weapons... that helps get big money.) Developed by a core group of 6 people, with many other people contributing.

Some discussion of how well it really scales; it will depend on the nature of the problem. (There was some talk about "ghost cells" and "guard cells", but I don't have the background to know what that's all about.)

Written mainly in f90, a little bit of C and python. Uses MPI, HDF. Tested on various platforms, compilers, libraries, etc. Mostly Linux platforms.

Examples of refinement. Can refine based on threshold on a parameter (density, temperature, whatever). Or, you can use it based on something like a 2nd derivative of some variable.
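A toy illustration (not FLASH code) of the two kinds of refinement criteria mentioned: a simple threshold on a variable, and a normalized second-derivative test.

    import numpy

    def needs_refinement(values, threshold=None, curvature_limit=None):
        """values: a 1D slice of some variable (e.g. density) through a block."""
        if threshold is not None and values.max() > threshold:
            return True
        if curvature_limit is not None:
            second_derivative = numpy.abs(numpy.diff(values, n=2))
            scale = numpy.abs(values[1:-1]) + 1e-30          # avoid division by zero
            if (second_derivative / scale).max() > curvature_limit:
                return True
        return False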

Architecture. Question about the time step; the whole thing is advanced together. Mesh refinement, then hydrodynamics, then diffusion, then source terms, then N-body, then materials, then gravity, then I/O.

Modules
Hydrodynamics: Solves the Euler equations for gas dynamics. Shocks and contact discontinuities. Uses (hybrid) Riemann and HLLE solvers; can use a different solver when a shock is detected, to improve precision. Suitable for coupling with gravity. Solves the MHD equations for magnetized fluids. Also has support for RHD (relativistic hydrodynamics).

Gravity

Particles; active or passive particles (WRT gravity). Particle Mesh method. N-body simulations. Long range forces (but not short range). Sink particles; not implemented, but they're working on it (based on Seyit's request).

Molecular cloud fragmentation: Thermodynamics
Shows figure: evolution of a molecular cloud with an overdensity that collapses, forms a star and disk. He's looking at the fragmentation process; how does it fragment under different conditions? Try different metallicities, rotations.

Physics involved: hydro, gravity, rotation, turbulence, shocks, adiabatic and shock heating, metallicity dependent cooling, non-LTE effects included. Not really doing the radiative transfer; that is buried in the heating and cooling coefficients.

Initial conditions; looking at starburst galaxies with different metallicities (also dwarfs with low-Z).

Uses a cooling function as a function of temperature, density, and metallicity. (In ergs s$-1$ cm$-3$.) Includes dust, atomic, molecular. There is an assumed ionization fraction at each temperature. (What is the assumed source of ionization? Dunno.  For certain background radiation.  Perhaps assumed no electrons below 10$4$ K?  There are no cosmic rays included in this.)  Cooling and heating treated differently.

Shows some results. At high metallicity it fragments more than at low metallicity. Few hundred solar masses at high (1 Zsun) to few thousand at low (10$-3$ Zsun). Fragments more at high rotation and less at low rotation.

Shows phase diagrams. Get more stuff at higher densities at higher rotation. Steeper T vs. n slope at high metallicities than low metallicities; it cools better at high metallicities. When it cools better, it collapses more and can form more compact objects. In many cases you see a two-phase medium. Note that he's plotted gridpoints, so what you see is biased by the refinement. However, the refinement was based on a density threshold, so the two-phase statement still holds even if you take the refinement away.

Shows a Jeans mass plot. Most points lie in the range 50-1000 solar masses; can go as low as 10 solar masses.

Black Hole, X-ray, Molecular cloud
Recent work : if you have X-rays from a black hole shining on a molecular cloud, follow the evolution. Uses chemical input from (somebody else), imported into Flash.

Caveat
(One note about Flash : it has a closed licence. You cannot change it and publish the code you've changed.  Also some people can't get it based on their origin....)

(That would probably answer the questions as to "how should FLASH be integrated with AMUSE, and which way"... things with that kind of restriction would probably not be great for AMUSE. AMUSE needs to be completely public.)

RADMC-3D
A versatile and easy-to-use tool for continuum and line radiative transfer; Kees Dullemond.

Purpose of the code is not to do radiative transfer in a hydrodynamical setting. It's not the radiation module of radiation hydrodynamics. It's the post-processing radiative transfer. After you've done hydro and so forth, you want to make prediction of CO map or SED or whatever. It's working but is still under development.

Radiative transfer
Radiative transfer is in fact two topics.

In dynamic modules:
 * Must be extremely fast (RT = bottleneck).
 * High accuracy not feasible (not really necessary)
 * Using mean opacities, flux limited diffusion, simplex-style
 * Must be as parallelizable as the hydro
 * Complex on MPI

Post-processing, for comparison to observations
 * Must be very accurate, and frequency dependent
 * Must include complex radiative physics (lines, dust)
 * Must not necessarily be extremely fast. (Still nice, at least for quick-and-dirty images.)
 * Can often be done on shared-memory machines

He's thinking of a method for making images for an observer at a distance, but also an observer inside. Fly through maps...!

RADMC-3D: Features

 * continuum radiative transfer (dust)
 * Gas transfer (for now only LTE, LVG, EscP)
 * 1d, 2d, and 3d
 * Cartesian or spherical (not yet) coordinates
 * Various gridding:
 * Regular
 * Regular + AMR
 * Currently working on tetrahedral, Delaunay, Voronoi grids
 * Interface with:
 * FLASH. Can load result from FLASH into this code
 * AMUSE...? Not done yet.

Shows an example of an AMR tree structure. Radiative transfer done in two stages. Stage 1, Monte Carlo: Dust Temperature. Guaranteed energy conservation-- whenever a photon is absorbed, it must be re-emitted (somehow), eventually it escapes. Assuming local energy balance, so that any energy that gets reabsorbed will get reemitted.

(Note that the temperature here may not be consistent with the temperature that came out of the hydro code. The sound speed goes as sqrt(T); the (Planck) thermal emission goes as T$4$.  The hydrodynamics isn't so affected by an error in T, but the outgoing spectrum will be hugely affected.  Also, dust and gas temperature may well differ.)  (Could skip stage 1, and just use the temperature you got.)

Stage 2: take the temperature structure, ray trace through it from your eye and make an image. (Almost a trivial thing, just ray tracing. Dust scattering makes it more complicated; need to do some Monte Carlo.)

Using the code
Shows a cloud that fragments and makes some interesting structure. Is assuming some symmetry. Shows result at different wavelengths.

Code comes with a set of user interfaces; shows a GUI written in IDL that uses the IDL widgets.

Shows another example of two colliding flows in galaxy. Density gets high enough that you get efficient cooling and temperature drops quickly.

Different Gridding
Has cubic grid with oct-tree refinement, can go to any level.

Working on tetrahedral or general gridding. Specify the vertices and the walls, connected in any way you want. It may come out of the hydro. (A Voronoi hydro simulation exists; Springel has one.)

Issues of Parallelization
Currently a serial code. A simple way of parallelizing is planned (see the sketch after the list):


 * MPI
 * Each node has FULL grid (possibly a memory issue for large models)
 * Partly embarrassingly parallel
 * Let 8 cores do MC for 5 minutes
 * Add all cell-energies (gather)
 * Redistribute (broadcast)
 * Recompute the new temperatures
 * Do another 5 minutes, etc.
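A minimal sketch of that gather/broadcast cycle with mpi4py (assumed library; the Monte Carlo and temperature steps are trivial stand-ins, not the real RADMC-3D physics):

    from mpi4py import MPI
    import numpy

    comm = MPI.COMM_WORLD
    n_cells = 100000
    cell_energy = numpy.zeros(n_cells)

    def run_monte_carlo_chunk():
        return numpy.random.random(n_cells)       # stand-in for 5 minutes of photon packets

    def recompute_temperatures(energy):
        return (energy + 1e-30) ** 0.25            # stand-in for the real T(E) relation

    for chunk in range(10):
        local = run_monte_carlo_chunk()
        total = numpy.zeros(n_cells)
        comm.Allreduce(local, total, op=MPI.SUM)   # "add all cell-energies" + "redistribute" in one call
        cell_energy += total
        temperature = recompute_temperatures(cell_energy)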

For the foreseeable future, he's concentrating on problems where all of the data can fit into a single node.

End of slide show, click to exit.

Shows a demo in IDL; only the interface is in IDL. Code itself is in f90. When you futz with the interface, it runs the code on the fly. You start the code, ask it to read the data. Code runs in child mode. Code sits and listens to a pipe asking for new commands. (Stage 1 is already calculated. Stage 2 is done on the fly as you change the parameters of what you want to look at.)
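Roughly, the child-mode pattern looks like this from a driving script (a Python illustration; the actual interface is in IDL, and the invocation and command syntax here are assumed, not checked):

    import subprocess

    # Start the compute code as a child process that listens on a pipe for commands.
    child = subprocess.Popen(['radmc3d', 'child'],          # assumed invocation
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE,
                             universal_newlines=True)

    def send_command(command):
        child.stdin.write(command + '\n')                    # e.g. a "make this image" request
        child.stdin.flush()
        return child.stdout.readline()                       # reply; stage 2 runs on the fly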

Discussion
Discussion of radiative transfer in AMUSE. Two different interfaces, one for radiation hydro, one for post-processing? Do want the full-detail code to be able to do radiation hydro just in case you've got the computer time to be able to do it and blow lots of time on the radiative transfer.

Architecture of an N-body Simulator
Mark Ridler, talking about his simulator GravSim.

Mark has a history in computer science and works as a program manager at a software company. Has an interest in physics; for the last year, he's been developing an N-body simulator based on Piet's and Jun's ACS, ported to C++.

Intended as intro to N-body systems. Written in Ruby. Has plug-and-play integrators. Individual time-step code.

Integrators: Mark likes Hermite. How does it compare to 2nd order Leapfrog? Goes to his website to show ... oops, the wireless is down. Well, there's an article on his website that has comparisons.

He ports to C++ because it's higher performance than Ruby. Also opens up other possibilities. Ported most of the world6.rb code as-is. The multi-step integrator didn't quite make it. Some of the code has been refactored, e.g.:


 * Calculates many-at-a-time rather than 1-at-a-time
 * Transaction model commits changes together
 * WorldLine limited to last 2 points only. (If you don't limit it, you eventually use up all your memory....)
 * Mass moved from WorldPoint to Particle
 * Body now has:
 * a. Particle... which represents the present
 * b. WorldLine... which represents the past

Architecture:

Simulator class, is_a Model, which has_N Body. Model is static; the simulator has the concept of time. The simulator uses components. Primarily an Integrator. Optionally uses a Divider and an Analyser. The Divider is a space divider like an octree or a kd-tree. He introduced an architecture where you can put a space divider together with an interactor, which builds a tree code; you can build other types of tree codes down the road. The Analyser is where you analyse two different 2-body orbits, or N-body regularization. If you're not happy with the general purpose integrator, you can make decisions.

Particle: has_a WorldPoint (time, pos, vel, acc, jerk). Composite (r>0, multiple components) and Body (r=0, 1 component) is_a Particle; Body has_a WorldLine, WorldLine has_2 WorldPoint. Composites come in when you are dealing with a tree code.

Model: Plummer, Uniform, each is_a model; can make other models. Model has Body(n). can generate, load, save, and has various statistics.

Integrator has step, predict, interpolate. Various algorithms.

Divider; Tree is_a Divider. Octree, KDTree, CurveTree is_a Tree. Plug in other space dividers? (CurveTree is something he's been experimenting with.)

Interactor; has calc_gravity(Composite *). Direct and Pairwise are two interactors that he has.

Analyzer, just recently started to add in; has method 2_body_regularisation. One is Kepler, one is Kustaanheimo-Stiefel. Only so far has worked on a basic Kepler scheme for elliptical orbits.

Shows some examples.

Interlude
Lunch. Heard a talk about proteins in cell membranes that was mystifying to a guy who thinks the Universe is made up of Dark Matter, Dark Energy, and Baryons. Another talk about bothrodesy, whereby Kip Thorne talks about confirming that the massive dark objects at the cores of galaxies are really black holes via the gravitational signature of eating a smaller black hole.

Radiative transfer modeling on AU scales of IR molecular lines from Protoplanetary Disks
Rowin Meijerink, Caltech

Motivation
Most planets are found between 0.01 AU and 10 AU. You start with a disk; planets form from that. How can this region be observed?

Continuum (IR thermal, scattered); atomic lines (OI), molecular lines (IR/submm/mm).

IR lines, around 4.7 microns. Carbon monoxide. Several lines. At longer wavelengths, Spitzer (R~500, or 600 km/s) detected all kinds of molecules. Lots of water lines, but some other things too. Preliminary analysis indicates that the water lines arise from warm gas (500-1000 K) inside 2 AU. Looking at gas close to the star, but in the planet forming region (where Earth and things like that form).

Radiative Transfer
To interpret all this data, one needs radiative transfer. To date, lots of interpretations are done assuming LTE and using slab models. They are pretty successful in fitting the line data, at least in a small part of the spectrum. Rowin is going to argue that we need to do more than that.... For instance, we want to extract parameters like the density distribution, temperature distribution, abundance structure, etc. For some other molecules, there may be chemical formation thingies we want to learn about.

Scheme for modelling. First, 2D dust radiation transfer (RADMC) by Dullemond & Turolla (1998). Get the temperature of the dust disk, and the mean intensity in every part of the grid. Images and spectra rendered in 3D. Could also do the step in between, which is the 1+1D non-LTE detailed balance calculation. All of this is in preparation for Herschel/JWST/E-ELT.

Code
Flow chart:
 1. Dust radiation transfer -- dust temperature and mean intensity calculated.
 2. Non-LTE excitation calculation on the 3D distribution -- level populations.
 3. Ray tracing -- spectrum/image rendered.

What's special about RADLite? Very fast. Note that in the inner part of disks, there can be a huge velocity gradient; there can be km/s differences from one cell to the next. However, the intrinsic line width may be of order 10 m/s. As such, very different velocities play an important role. Classical codes integrate from boundary to boundary and may completely miss a line embedded well inside the cell, so you'll misestimate the flux and shape of the line. RADLite subsamples each gridcell with a sufficient number of points across the line. Very important for the inner part of disks. The code is optimized to figure out where the line is and then sample it.

A spectrum and image velocity cube of a line at 3 km/s resolution is rendered in 15-30 seconds on a single 3 GHz Intel Xeon processor.

This code can handle orders of magnitude in velocities. The whole spectrum is thousands of km/s; a line may be just a couple of km/s in the outer part of the disk. (Example: CO ro-vibrational bands.)

LTE slab vs. Non-LTE 2d models
Although a 2-parameter model can give good fits to H$2$O line observations in the MIR Spitzer IRS band, this doesn't mean that the model is correct. Slab models ignore: complex geometries, densities ranging from n<10$3$ to 10$16$ cm$-3$, and temperatures T~100-5000 K. Upper level energies range up to E ~ 5000 cm$-1$, and transitions are scattered throughout the spectrum. Finally, LTE only holds when collisions dominate the level populations.

Beta3D code... used in 1D. Can be applied in 3D in finite time if you consider ~10 levels, but in the IR here there are a few hundred levels....

Shows a plot of Non-LTE/LTE column density ratios. At 0.3-1 AU, around 2000-3000 cm$-1$ energies, differences get mongo.

Take into account that below a certain temperature (dependent on density), the water freezes out onto grains. The gas temperature in the disk surface is decoupled from the dust due to heating by FUV, X-rays, or both. The gas-to-dust ratio is larger than the ISM value due to dust settling.

Taking into account everything he considers, he shows that he's able to bracket the observations. (Doesn't try to do a fit yet, just showing that he can bracket the parameter space.) Current study is only qualitative....

Current investigations include the intrinsic line width (currently dominated by thermal broadening); stellar properties; inner holes; other molecular species.

= Wednesday 10/07 =

Morning talks: RKNOP FAIL
Apologies; my laptop died, so I failed to take notes for the talks today. Let me especially apologize to the people who talked today :(

Then, I realized how dependent I was on my memorized passwords, as I forgot my Wiki password and eventually resorted to SQL to reset it....

Discussion about Hydro
We all agree that getting hydro and radiative transfer into AMUSE is harder than the other parts, because it hasn't been done yet, and we don't know fully how to do it....

Vincent Icke is talking about what the theory group is doing with hydro simulations here at Leiden. In particular, he's showing simulations of high-mass binary stars that have mass loss, the idea being that this is relevant to Eta Carinae. (2D simulation.)

Vincent will lead the discussion. He has a few points that he thinks are relevant when talking about the amalgamation of hydro and N-body. (There was some discussion.)


 * Need CPU time matching (approximately equal time on radiative, gas, and N-body) to be fair to all
 * if you want to do generic computations. In certain conditions you may say that you don't care that you spend 99% on radiative transfer, if that's what you need.
 * some things may parallelize better than other things; wallclock vs. cpu time
 * Vincent says his motivation is first generation stars
 * Resolution doesn't scale the same as the CPU time in hydro and nbody; physically, you might want to match resolution.
 * Applications : stellar clusters, galactic centers/nuclei, etc.
 * Initial conditions: need to match
 * 3-dimensional hydro visualization.... Vincent has experimented with OpenGL, Visit (?), but for one reason or another has still found it very difficult to find a proper representation of a 3 dimensional hydro visualization
 * Vincent says he hasn't specifically mentioned which hydro code we're going to use. He says there are some excellent battle-tested codes out there.

Another question: interface. Consider gridbangers vs. SPH; can the same interface even work with them? Vincent says that the 3d codes that are worth anything at all are all AMR codes. He says that the criterion in hydro for refining the grid is different from what you'd do in an n-body point mass distribution. His feeling is that this can be solved. He thinks that the difficulty will be not to write code based on good physics how to distribute stuff, but things like memory assignment, local vs. global memory, communication problems. He claims to be a total ignoramus in this respect....

N-body: want to localize the point with the highest instantaneous acceleration. In hydro, want to localize the points with the highest gradient. In radiative transfer, want to localize the points with the highest gradient in optical depth. Drastically different again if you include ionization. However, since these are well-known and well-formulated bits of physics, we'll find a way of dealing with them.

Inti addresses the interface and the choice of codes. Different hydro methods represent different data in different ways. With gravity, you can plug in different modules and compare methods directly. Is this possible in hydro?

Obstacles to interface:


 * Representation of data : SPH vs. Grid vs. AMR vs. Voronoi vs. etc.
 * Initial and boundary conditions. Often codes have these compiled into the executable.  This will be a barrier to swapping things through.  Can there even be a generic way of specifying ICs and BCs that you can use at least most of the time?  Cumbersome if you have to recompile for every problem....

Evghenii says that the BCs are going to be the most difficult case.

Ultimately you only care about the physical boundary conditions; what's internal to the code is the code's problem. Inti asks if there is a universal description of physical boundary conditions that can be sent to codes... whether or not the codes can do that will depend on the code. (E.g., is it enough to specify "an open boundary", or is it method dependent? To some extent it's method dependent.)

Vincent says we're dealing with open source code; recompiling his code, anyway, is fast. Is that even a problem? AMUSE in principle could compile the code. That does make implementation of the interface more work. (Inti says that that was one way to solve the multiple instance problem for f2py.) It can be done, and can be hidden from the user.

Vincent suggests in AMUSE there be a two stream approach. The first things we do are going to be for professionals only; the people who will use it first will know how to deal with the various arcane things. Only after all of that has been worked out will the more general population of scientists be able to run it with some ease.

Vincent asks if anybody here has ever run simultaneous hydro and N-body; it's done all the time with SPH, and there are feedback mechanisms and that kind of thing.

Has anybody run an amalgamated radiative transfer and hydro code? Radiation hydro problems are being done as integrated codes, using a simple but efficient radiative transfer code. (Were these things born in separate families and then married? In the particular case under discussion, it was designed from the start to be this.)  Pluto is a code that was made in Italy to do this; simple to use, very versatile.

FLASH is another example of a modular code that includes lots of stuff.

Perhaps hydro code, being so different from N-body, needs a different way of doing things. Consider, for example, running a hydro code and sending it queries: what is the temperature, density, etc., at this point? (Apparently sometimes in SPH folks will "secretly" pretend that the SPH particles are "real" rather than carriers of a function.) (Does it even make sense to do this with SPH?  The interface may need to be adaptive, so it knows particles vs. fields for SPH vs. mesh.)
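One hypothetical shape for such a query interface (names are illustrative; an SPH code would answer with a kernel-smoothed estimate, a grid code with a cell interpolation, so the caller never sees the underlying data representation):

    def get_state_at_point(hydro_module, x, y, z):
        """Return (density, temperature, velocity) of the gas at position (x, y, z)."""
        # How the estimate is made is the module's problem: kernel sum for SPH,
        # interpolation between cells for grid/AMR codes.
        return hydro_module.estimate_state(x, y, z)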

Head<---BASH--->Wall

Gravity solvers.... Will local gravity be applicable for grid based methods? It creates extra difficulties.

Inti says it's important not to see the codes as solvers. Evolvers rather than solvers.

Lunch.

CONCLUSION: It's hard

Summary:
 * Probably we should draw a line between particle and grid based methods. (Two interfaces?)
 * May have to accept recompiling for ICs and BCs
 * What would be a good first "test case" problem to define the minimum requirements? Box with an outflow?  Outflow from a binary?  Outflow from stars, open boundary conditions, no self-gravity in the gas?  Think about it this afternoon.

Vincent guesses, looking at it from the outside, that the physical precision produced by N-body is higher than what's been done in hydro and in radiative transfer; he says perhaps take a cue from N-body as a result. Perhaps initially we'll have to live with a system that the N-body folks might consider a little trivial (smaller number of stars). Perhaps instead of full radiative transfer, put in some prescription for feedback of stars into gas.

= Slides of the presentations =
 * Evghenii Gaburov
 * Rowin Meijerink
 * Seyit Hocuk