Website of Eke van Batenburg

Research projects

During my work at "theoretical Biology" the most enjoyable research projects were in collaboration with other scientists.
They contributed the expertise of their own particular field of biology; my contribution as a "renegade biologist" was the computational aspect.

All in all these projects covered a wide variety of topics.
In the following sections I present a sample of research topics in more detail.


  Adaptability over generations
    with B. Zwaan

Butterflies of the genus Bicyclus live in an environment that can be similar to that of their children, but very different from the environment of their grand-children and great grand-children.
If they adapt too well, their grand-children may have problems adapting to their own new environment.

For example, at one moment in time, all butterflies live in the wet season and have their wing patterns and behaviour adapted to the characteristics of the wet season. However, their grand-children and great grand-children may live in the dry season which requires another wing pattern and a different behaviour.

If the butterfly is too enthusiastic in adapting to its environment, to the point of losing the genes that are profitable in the other season, following generations will be in trouble.

We want to know which parameters and parameter values are essential for surviving such changing environments.

  APL/CSMP: Continuous simulation

In Biology, simulation is a very important research method. For simulation, two types of simulation languages are useful tools: continuous simulation languages and discrete simulation languages.

For continuous simulation there are ample languages available: from "prehistoric" languages for mainframe computers like DYNAMO, CSMP and ISIM, to modern visually oriented, click and point languages like Madonna and Vensim.

I found these modern languages easy to start with, but rather tedious in the long run. For an experienced user, a textual representation is easier to enter and better to comprehend. Although the big "picture" is a bit easier to grasp in a visual representation, text appears much better for details while still satisfactory for the big overview. Furthermore I was often disappointed in the numerical capabilities of the existing simulation languages.

For this reason I started to develop a continuous simulation tool in APL. APL is a programming language that is very versatile and powerful. For keywords I used mainly those of CSMP. This resulted in a powerful and still familiar language.
An experienced CSMP programmer will immediately be familiar with the design of my program. He will notice one difference though: an arrow (→) in front of the keywords INITIAL, DYNAMIC, TERMINAL and END.

The current status of the project is that everything works fine for my applications. A few things that are available in CSMP (such as variable step integration) are still missing because my need was not yet sufficiently strong to spend time on them. Finally, I have not (yet) had the time to document the product extensively enough to make it interesting for other parties. These chores will be done in the future, depending on the priorities at my department.
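
To illustrate the INITIAL / DYNAMIC / TERMINAL structure mentioned above, here is a minimal sketch in Python rather than APL. It is not the APL/CSMP tool itself: the function names, the logistic growth model and its parameter values are assumptions chosen for illustration, and the integration is plain fixed-step Euler (no variable step).

```python
def simulate(initial, dynamic, terminal, finish_time, step):
    """Fixed-step Euler loop organised in CSMP-like phases (illustrative only)."""
    state = initial()                      # INITIAL: set starting values once
    time = 0.0
    n_steps = round(finish_time / step)
    for _ in range(n_steps):               # DYNAMIC: repeated every time step
        rates = dynamic(time, state)
        state = {name: state[name] + step * rates[name] for name in state}
        time += step
    return terminal(time, state)           # TERMINAL: report once at the end


# Hypothetical model: logistic growth dN/dt = R*N*(1 - N/K)
R, K = 0.5, 100.0


def initial():
    return {"N": 1.0}


def dynamic(time, state):
    n = state["N"]
    return {"N": R * n * (1.0 - n / K)}


def terminal(time, state):
    return time, state["N"]


end_time, end_n = simulate(initial, dynamic, terminal, finish_time=40.0, step=0.01)
print(end_time, end_n)
```

In this sketch the population approaches the assumed carrying capacity K; a smaller step trades speed for accuracy.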

  APL/GPSS: Discrete simulation

As said before, simulation is a very important research method. For discrete simulation there are also ample languages available: from "prehistoric" languages for mainframe computers like GPSS, SIMSCRIPT, SLAM and SIMULA, to modern visually oriented, click and point languages like Arena and AweSim.

As explained in the previous section about APL/CSMP, my enthusiasm for the visual programming methodology waned in the long run. I found that for an experienced user, a textual representation is easier to enter and better to grasp.

For this reason I started to develop a discrete simulation tool in APL. APL is a programming language that is very versatile and powerful. For keywords and concepts I used mainly those of GPSS. This resulted in a powerful and still familiar language.

The current status of the project is that everything works fine for my applications. There are many more SNAs (standard numerical attributes) available in this implementation than in GPSS. For example "AP 3" yields a vector with the parameter 3 values of all entities, and "AP 4 5 6" yields a table with the parameter 4, 5 and 6 values of all entities. Such vectors and tables are easily analysed using APL's powerful primitives.
On the other hand, a few things that are available in GPSS (such as all features of PREEMPT) are still missing because my need was not yet sufficiently strong. Also, I have not (yet) had the time to document the product extensively enough to make it interesting for other parties. These chores will be done in the future, depending on the priorities at my department.
  • F.H.D. van Batenburg & H.P.T.v.d.Star & M.van Baaren (1989): Discrete simulation by APL-GPSS: the best of two worlds. APL Quote Quad Conf.Proc.(1989)12-22.
  • F.H.D. van Batenburg & J.C.van Lenteren & J.J.M.van Alphen & K.Bakker (1983): Searching for and parasitization of larvae of Drosophila melanogaster: a Monte Carlo simulation model and the real situation. Neth.J.of Zoology 33(3)306-336
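
The idea behind the AP attribute described above can be sketched in Python. This is only an analogy: the entity store, the function name `ap` and the row-per-entity layout of the table are assumptions, not the APL implementation.

```python
# Hypothetical entity store: each entity maps parameter number -> value.
entities = [
    {3: 0.5, 4: 10, 5: 2, 6: 7},
    {3: 1.5, 4: 11, 5: 3, 6: 8},
    {3: 2.5, 4: 12, 5: 4, 6: 9},
]


def ap(*params):
    """Mimic AP: one parameter gives a vector (one value per entity);
    several parameters give a table (assumed: one row per entity)."""
    if len(params) == 1:
        return [entity[params[0]] for entity in entities]
    return [[entity[p] for p in params] for entity in entities]


vector = ap(3)          # like "AP 3"
table = ap(4, 5, 6)     # like "AP 4 5 6"
mean_p3 = sum(vector) / len(vector)
print(vector, table, mean_p3)
```

Having all parameter values available as one vector or table is what makes whole-population statistics a one-liner.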

  APL hybrid simulation

Occasionally I needed to simulate the values of continuously changing physiological levels while using a discrete simulation. Having my own language for discrete simulation available in APL, I added a few new blocks that I propose to add to GPSS for hybrid simulation.

  • Continuous Assign: →CASSIGN A,B
    Once an entity passes CASSIGN, its parameter A will be continuously (at each time change) updated to value B (fields A and B may be expressions).
    The entering entity passes, because CASSIGN does not halt entities.
    The continuous assignment for parameter A stops when that entity passes a block CASSIGN A (no B-field).

    Example of continuous assignment of the time to parameter 3:
    →CASSIGN 3,C 1
    Example where P5 (parameter 5) of entity 2 is assigned to P4 of passing entity:
    →CASSIGN 4,(AP 5)[2]

  • Continuous Integration: →CASSIGN A,INTGRL B
    When the B-field of CASSIGN contains the word INTGRL, the value of the A-parameter is continuously integrated by the B-expression.

    →CASSIGN 4,INTGRL(0.3×(P 4)*0.5)-P 2

  • Continuous Test: →CTEST_X A,B,C
    Here X stands for LT, GT, EQ, LE or GE.
    Once an entity passes a CTEST, the relation A X B is tested continuously, even though that entity is no longer in the CTEST-block.
    Like CASSIGN, the CTEST does not halt entities; they enter and continue immediately to the next block.
    Once the test is true for a particular entity, it is removed from wherever it may dwell and sent to the block specified in C. This will also stop the CTEST activity for that entity.

    Example to send an entity to label "period" as soon as its first parameter (P1) equals 3:
    →CTEST_EQ (P 1),3,period
    Notice that each passing entity can have a different test depending on the value in its parameter 1.

    Another way to stop the CTEST activity is if that entity passes a block "CTEST A,B".
    Here A is a label that refers to the activating CTEST-block. The B-field can be omitted; if supplied, not only the passing entity, but all entities of group B will deactivate their continuous test.

      jmp:→CTEST_LT 5,(P 6),period
      →CTEST jmp
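
The interplay of continuous integration and a continuous test can be sketched in Python. This is a rough analogy under stated assumptions, not the APL implementation: the function name `run_hybrid`, the fixed time step, the Euler update and the starting values are all hypothetical. The rate expression follows the INTGRL example above (0.3 times the square root of P4, minus P2), and the test corresponds to a CTEST_GE on P4.

```python
import math


def run_hybrid(entity, integrals, tests, dt, max_time):
    """Advance time in small steps; update integrated parameters, then check
    every continuous test. Returns (time, label) for the first test that
    becomes true, or (time, None) if none fires before max_time."""
    time = 0.0
    while time < max_time:
        # Compute all rates first so the updates do not influence each other.
        rates = {param: rate(entity) for param, rate in integrals.items()}
        for param, rate in rates.items():
            entity[param] += dt * rate      # Euler step (assumption)
        time += dt
        for relation, label in tests:
            if relation(entity):
                return time, label          # entity is sent to this label
    return time, None


# Hypothetical entity with parameters P2 and P4.
entity = {2: 0.1, 4: 1.0}
integrals = {4: lambda e: 0.3 * math.sqrt(e[4]) - e[2]}
tests = [(lambda e: e[4] >= 5.0, "period")]

fired_at, destination = run_hybrid(entity, integrals, tests, dt=0.01, max_time=100.0)
print(fired_at, destination, entity[4])
```

The entity itself does nothing here; as with CASSIGN and CTEST, the updating and testing continue in the background until the relation holds.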

  Behavioral analysis
    with P.Haccou and E.Meelis of mathematical biology

The study of animal (and human) behaviour requires specific statistical techniques that are not readily accessible in the literature and are seldom available in general statistical packages.

To remedy this, our statisticians P.Haccou and E.Meelis compiled existing techniques and developed new ones that were published in a book "Statistical Analysis of Behavioural Data". The majority of those techniques were programmed in APL in the statistical package "The Analyst".

Some examples are:

  • detection of outliers
  • estimation of transition between acts
  • splitting and lumping of acts
  • detection of time-inhomogeneity
  • analysis of effects of time-dependent covariates
  • goodness-of-fit tests
  • Haccou,P. & Meelis,E.(1992): Statistical analysis of behavioural data. Oxford Univ.Press.

  Biological Pest control
    with Phan Van Lap of Hanoi University

In 1992 and 1994 I was invited to Hanoi University (Vietnam) to teach bioinformatics and to assist in organising a department of bioinformatics.

A practical application that we developed with the staff was a simulation study in the APL programming language. This simulation study investigated the effects of various parameters governing a system using "sterile male" pest control.

  Immunology: proteasome functioning
    with N.Beekman, F. Ossendorp & C. Melief of the "bloedbank" (blood bank)

The immune system has an interesting mechanism to recognise foreign cells. A particular organelle, "the proteasome", cuts proteins into small fragments. Some of those fragments, the so-called CTL epitopes, move to the cell surface and stick to it. The lymphocytes use these CTL epitopes for proper identification.

Our current research concentrates on identifying how and where the proteasome cuts those proteins. Our approach is to develop a computer program that explores potential rules and compares their effects with experimental data.

  Parasitism/predation: Biological pest control
    with J. van Alphen et al. of the ecology dept.

In Holland a lot of vegetables are grown in greenhouses. In these conditions pests are a nuisance that requires a constant fight. As people are more and more concerned about ecological problems, farmers are seeking new ways to fight this war with fewer chemicals.

One of the weapons that are becoming more and more popular is biological warfare where natural enemies of the pest are deployed. This requires a subtle balance of powers: if the enemy doesn't kill enough of the pests it is no good, if it kills all, it will die (or it will not have offspring) and a stray pest animal may start a new plague.

In a joint project with the dept. of animal ecology, the various parameters that govern the effectiveness of parasite-prey relations were studied. Comparing simulations programmed in APL with the real population revealed various deviations. These pointed to hitherto unsuspected mechanisms.

  • Batenburg,F.H.D.v. & Lenteren,J.C.v. & Alphen,J.J.M.v. & Bakker,K. (1983): Searching for and parasitization of larvae of ...; a Monte Carlo simulation model and the real situation. Neth.J.of Zoology 33(3)306-336.
  • Turlings,T.C.J. & Batenburg,F.H.D.van & Strien-v.Liempt, Why is there no interspecific host discrimination in the two coexisting larval parasitoids.... Oecologia 1985(67)352-359.

  Plant physiology: growth of root hairs
    with J.Kijne et al. of Plant Physiology

Papilionaceous plants are unique in that they can assimilate nitrogen from the air. Due to this extremely important feature, those plants are often used in third-world countries to fertilise the soil. They are planted and the full-grown crop is ploughed into the soil before a commercially more interesting crop is planted.

The special little factories that fix the nitrogen are located on the small hairs that extend outwards on the roots. Those root hairs start to form aberrant curls that finally distort the hair to a small "dumpling" where bacteria do the nitrogen fixation.

In a joint project with the dept. of Molecular Biology, several alternative ideas that could explain the curling mechanism were proposed. Those ideas were tested by various simulation models in APL. This showed that neither excretion of stimulating auxins nor excretion of inhibiting auxins could account for this phenomenon. The most plausible explanation was that the distribution of the bacteria is responsible.

  • Batenburg,F.H.D.van & Kijne,J.W. & Iren,F.van & Libbenga,K.R. (1983): Simulation of marked root hair curling in Rhizobium-legume symbiosis. Physiol.Plant 1983(59)363-368.
  • Batenburg,F.H.D.van & Jonker,R. & Kijne,J.W. (1986): Rhizobium induced marked roothair curling by redirection of tip growth: a computer simulation. Physiol.plant 1986(66)476-480.

  Phylogenetics: snail coiling (chirality)
    with E.Gittenberger of theoretical biology & phylogenetics

Although "the evolution theory" is the basic paradigm for biologists, there are still white spots on the evolution map that are not explained. There are also "grey" spots where we don't yet know exactly what happens, although we think that the final answer will fit nicely into the evolutionary theories.

One such grey spot is the surprising fact that nearly all snails have shells that are right-handed spirals. More surprising is that some snail species that have such so-called dextral (right-coiled) individuals occasionally also have mirror-image sinistral (left-coiled) populations in remote areas. As it can be proven that the "majority wins over the minority" (no philosophical or ethical argument, but a mathematical one), it is puzzling how such a minority can originate and stay relatively permanent afterwards.

In a joint project with the dept. of Evolution & taxonomy we investigated what subtle mix of mutation frequency, isolation forces, migration speed and fitness pressure can be responsible for this.

An APL simulation model showed that the maternal effect (a characteristic of the child, here the coiling direction, is determined by its mother's genes only), which many evolutionary biologists hold to be the most plausible explanation of speciation in some particular circumstances, is not instrumental at all.
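
The maternal effect itself is easy to state in code. The following Python sketch is not the APL model: it assumes a single locus with two alleles, "D" and "s", and assumes that "D" (dextral) is dominant in the mother; the function names are hypothetical.

```python
import random


def offspring_genotype(mother, father, rng):
    """Ordinary Mendelian inheritance: one allele drawn from each parent."""
    return (rng.choice(mother), rng.choice(father))


def coiling(mother_genotype):
    """Maternal effect: the coiling direction of a snail depends only on the
    genotype of its mother (assumption: allele D is dominant)."""
    return "dextral" if "D" in mother_genotype else "sinistral"


rng = random.Random(1)
mother = ("s", "s")
father = ("D", "D")

child = offspring_genotype(mother, father, rng)
child_phenotype = coiling(mother)        # decided by the mother's genes
grandchild_phenotype = coiling(child)    # the child acting as a mother
print(child, child_phenotype, grandchild_phenotype)
```

The child carries a D allele but still coils sinistrally; the effect of its own genes only shows up one generation later, in its offspring.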

  • Batenburg,F.H.D.van & E. Gittenberger(1996): Ease of fixation of a change in coiling: computer experiments on chirality in snails. Heredity (1996) 76: 278-286.

  Evolution of protein structure
    with J.P.Abrahams et al. of BioChemistry

International scientific teams that focus on the determination of complete genomes generate enormous amounts of data with protein sequences. Biophysical techniques can reveal three-dimensional protein structures, and current computer techniques enable us to view, manipulate and investigate those structures on screen.

In a combined effort of the Institute of Theoretical Biology and the research school BIOMAC, we will investigate the evolutionary aspects of protein structure, initially by focusing on a small group of proteins with a substantial amount of the amino acid cysteine.

1) First of all we will develop software to analyse the amino-acid order in such proteins in order to make guesses about relationships with other proteins and to predict the most probable covalent structure of internal cysteine cross-couplings.
2) After making a survey of the known structures we will look for relations and patterns in covalent cysteine couplings.

We intend to develop software that can predict the three-dimensional structure using the primary structure of a peptide with S-S-couplings.

    Cooperating members:
  • Prof.J.P.Abrahams (Biophysical Structural Chemistry)
  • Dr.M.Overhand, Dr.G.A.van der Marel, Prof.J.H.van Boom (Bio-organic Synthesis)
  • Dr. F.H.D.van Batenburg, S.Ypma (Bioinformatics)

  PseudoBase: a database of RNA pseudoknots
    with C.Pleij of Leiden Institute Molecular Biology & S.Gultyaev

RNA sequences form secondary structures that are known as hairpins, internal loops, bulges, multiple loops and pseudoknots.
Pseudoknots were first discovered by C.Pleij in plant RNAs in 1982. Since then many more have been found.

The secondary (and tertiary) structure is extremely important. Knowing the secondary structure of RNA can help in understanding the function of certain RNA and non-function of related (mutated) sequences; it can explain phylogenetic similarities and dissimilarities; it can explain why some parts are easily mutated and others not; and so on.

One way to study pseudoknots is the theoretical approach.
Pseudoknots cannot be predicted by programs that use the minimal energy criterion, but our program STAR (STructure Analysis of Rna) uses different algorithms that enable us to predict pseudoknots. One is the "greedy algorithm", an alternative is the "stochastic algorithm", and the last, and most interesting, is the "evolutionary algorithm". See the next section for our project "Rna Folding: predicting secondary structure".
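
A pseudoknot is commonly characterised by base pairs that cross each other instead of being nested: positions i < k < j < l with i paired to j and k paired to l. The Python sketch below checks a list of base pairs for such a crossing. It is an illustration only; the function name and the example structures are made up and are not taken from STAR or PseudoBase.

```python
def has_pseudoknot(pairs):
    """pairs: list of (i, j) base-pair positions with i < j.
    Returns True if any two pairs cross (i < k < j < l)."""
    ordered = sorted(pairs)
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:
                return True
    return False


# Hypothetical structures given as base-pair lists.
hairpin = [(0, 9), (1, 8), (2, 7)]                 # nested pairs only
pseudoknot = [(0, 9), (1, 8), (5, 14), (6, 13)]    # second stem crosses the first

print(has_pseudoknot(hairpin), has_pseudoknot(pseudoknot))
```

Methods that build a structure only from nested pairs can never produce the second kind of list, whatever the sequence.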

Theoretical studies must be critically reviewed in the light of experimental data. In this project "PseudoBase" the Institute of Theoretical Biology and the Leiden Institute of Molecular Biology decided to build a database of pseudoknots that are experimentally proven.

  • One can consult this database.
  • One can submit proven pseudoknots.

To access this database: PseudoBase.

  • Batenburg,F.H.D.van & Gultyaev,S. & Pleij,C.W.A. (2000): Pseudobase: a database with RNA pseudoknots. Nucl.Acids Res. 2000,28(1)201-204
  • Batenburg,F.H.D.van & Gultyaev,S. & Pleij,C.W.A. (2001): Pseudobase: structural information on RNA pseudoknots. Nucl.Acids Res. 2001,29(1)194-195

  Rna Folding: predicting secondary structure
    with C.Pleij et al. of Molecular Biology & S.Gultyaev

Life depends for many essential processes on RNA (RiboNucleic Acid). Roughly speaking, each RNA molecule is a unique sequence of many nucleotides based on 4 different nucleotides. This is called the primary RNA structure. This sequence determines how the RNA functions, but not exclusively so. It was discovered that RNAs are not loosely floating strings in the cell, but that in each RNA various parts glue together, forming a uniquely structured bun. Such a bun consists of structures known as hairpins, internal loops, bulges and multiple loops. This so-called secondary structure (and the tertiary structure based on it) highly influences the functioning of the RNA.

The most frequently used approach to determine the secondary structure of RNA is the minimal energy algorithm (see for example MFOLD by Zuker). Such programs require a lot of computer power (speed as well as memory), and predict roughly 3/4 of the secondary structure correctly.

In a joint project with the Institute of Molecular Biology we programmed 3 completely different algorithms: 1. greedy algorithm, 2. stochastic algorithm and 3. evolutionary algorithm. The latter simulates the folding process, rather than predicting "the" final folding. This APL program allows scientists to predict the structure on small PCs in a very reasonable time (in the nineties it took several hours on a standard PC to predict the structure for an RNA of 300 nucleotides). Such predictions are often (although not always) better than minimal energy predictions.
For shorthand we called this program STAR (STructure Analysis of Rna).
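
The general idea of a greedy approach can be sketched as follows in Python. This is a generic illustration and not STAR's actual procedure, which is not described on this page: the candidate stems, their energies and the rule "take the most stable stem that shares no bases with the stems already chosen" are all assumptions.

```python
def greedy_fold(stems):
    """stems: list of (energy, five_prime_bases, three_prime_bases).
    Lower energy means more stable. Repeatedly accept the most stable stem
    that does not reuse bases of an accepted stem. Crossing stems are
    allowed, so the result may contain a pseudoknot."""
    used = set()
    chosen = []
    for energy, left, right in sorted(stems, key=lambda stem: stem[0]):
        bases = set(left) | set(right)
        if used.isdisjoint(bases):
            chosen.append((energy, left, right))
            used |= bases
    return chosen


# Hypothetical candidate stems for one sequence.
candidates = [
    (-6.0, range(2, 6), range(20, 24)),    # shares bases 2 and 3 with the first stem
    (-8.0, range(0, 4), range(10, 14)),    # most stable, accepted first
    (-5.0, range(6, 9), range(16, 19)),    # crosses the first stem, no shared bases
]

structure = greedy_fold(candidates)
energies = [energy for energy, _, _ in structure]
print(energies)
```

A greedy choice is fast but never reconsiders an accepted stem, which is one reason to compare it with stochastic and evolutionary alternatives.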

Doing a simulation rather than a prediction, allows us to study the RNA dynamics. For example we can investigate the folding of the RNA molecule during its synthesis, thereby predicting intermediate foldings and in some cases functionally important metastable states.

Most programs use the minimal-energy-algorithm to predict RNA structure. Unfortunately, this algorithm cannot predict pseudoknots. However, our three algorithms can predict pseudoknots.

So the two main advantages of our program STAR are that 1) it simulates the folding pathway and 2) that it can predict pseudoknots.

  • Abrahams,J.P. & Berg,M.v.d. & Batenburg,E.v. & Pleij,C. (1990): Prediction of RNA secondary structure, including pseudoknotting, by computer simulation. Nucl. Acids. Res. 18(10)3035-3044.
  • Batenburg,F.H.D.van & A.P.Gultyaev & C.W.A.Pleij (1995): An APL-programmed genetic algorithm for the prediction of RNA secondary structure. J.theor.Biol. (1995) 174: 269-280.
  • Gultyaev,A.P. & F.H.D.van Batenburg & C.W.A.Pleij (1995): The computer simulation of RNA folding pathways using a genetic algorithm. J.Mol.Biol. (1995) 250: 37-51.

  Speciation through geographical isolation only
    with J.A.J.Metz and E.Gittenberger

In this study we use simulation as a tool for our investigations. Our simulations showed that segregation of a homogeneous population into two groups will result in a divergence of characteristics. This is expected and known as genetic drift.
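
Divergence of two isolated groups by genetic drift can be illustrated with a standard Wright-Fisher sketch in Python. This is not the model used in the study: one locus with two alleles, the population size, the number of generations and the random seed are all assumptions.

```python
import random


def wright_fisher(freq, pop_size, generations, rng):
    """Each generation, every gene copy is drawn at random from the
    allele frequencies of the previous generation (no selection)."""
    for _ in range(generations):
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = count / pop_size
    return freq


rng = random.Random(42)
start_frequency = 0.5

# Two groups split from one homogeneous population and then kept isolated.
group_a = wright_fisher(start_frequency, pop_size=50, generations=100, rng=rng)
group_b = wright_fisher(start_frequency, pop_size=50, generations=100, rng=rng)
divergence = abs(group_a - group_b)
print(group_a, group_b, divergence)
```

Repeating the run with other seeds shows that the two groups usually drift apart, and that small groups drift faster than large ones.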

Our simulation studies show that removing the dispersal barrier will quickly homogenise the two previously different groups into one. However, we also discovered that some circumstances could prevent this process. We are now investigating which parameters are responsible for this phenomenon.

  Sex determination in plants
    with T. de Jong

In several dioecious plants the mother can influence the sex ratio of her offspring. Rather than produce 50% of each, the mother can disperse for example 60% female seeds and 40% male seeds. In this study we use simulation as a tool to investigate the best strategy for plants. Two principles are important for the optimal sex ratio strategy of plants.

  1. Sib mating.
    Because seed dispersal is restricted, sib mating may occur which selects for a female bias in the seed sex ratio.
  2. Local Resource Competition (LRC).
    If a plant produces pollen its nuclear genes are dispersed in two steps: first through the pollen and then, if the pollen is successful in fertilising an ovule on another plant, through the seed. If the plant produces an ovule, its genes are dispersed only through the seed.
    By making pollen instead of ovules the offspring of a single plant is then spread out over a wider area. This reduces the chance that genetically related individuals are close together and need to compete for the same local resource (LRC). The effect is strongest if pollen is dispersed over a much wider area than seeds. Less LRC for paternally versus maternally derived offspring selects for a male bias in sex allocation.

Using simulation, we study the above-mentioned opposite effects in dioecious plants (with separate male and female individuals), with maternal control over the sex ratio (fraction males) in the seeds.
In a two-dimensional spatial model female-biased sex ratios are found when both pollen and seed dispersal are severely restricted.
If pollen disperses over a wider area than seeds, which is probably the common situation in plants, the seed sex ratio becomes male-biased. If both pollen and seeds are dispersed over a wide area, the sex ratio approaches 0.5.
Our results do not change if the offspring of brother-sister matings are less fit due to inbreeding depression.

  • Jong,T.de & Batenburg,F.H.D.van & Dijk,J.van:
    Seed sex ratio in dioecious plants depends on relative dispersal of pollen and seeds: an example using a checkerboard simulation model.
    J.Evol.Biol. 2002(15)1-7

  Why sex? Against Muller's ratchet?
    with L. Beukeboom

Organisms that reproduce without having sex suffer from Muller's ratchet. That is, deleterious mutations will in the long run deteriorate the quality of the gene pool of a population. Although back-mutations may occur, their incidence is so much smaller that slowly but inexorably the quality of the gene pool diminishes.
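
The ratchet can be made visible with a very small asexual population model in Python. This is a textbook-style sketch, not the model of this project: the population size, mutation probability, selection strength and seed are assumptions, each offspring gains at most one new mutation per generation, and back-mutations are ignored entirely.

```python
import random


def mullers_ratchet(pop_size, mutation_prob, selection, generations, rng):
    """Track the mutation count of the least-loaded individual over time."""
    loads = [0] * pop_size                  # deleterious mutations per individual
    least_loaded = []
    for _ in range(generations):
        # Fitness falls by a factor (1 - selection) per mutation.
        weights = [(1.0 - selection) ** load for load in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Asexual reproduction: a child inherits its parent's full load.
        loads = [load + (1 if rng.random() < mutation_prob else 0)
                 for load in parents]
        least_loaded.append(min(loads))
    return least_loaded


history = mullers_ratchet(pop_size=50, mutation_prob=0.3, selection=0.02,
                          generations=200, rng=random.Random(7))
print(history[0], history[-1])
```

Because no offspring can carry fewer mutations than its parent, the load of the best class can only stay equal or rise; each rise is one irreversible "click" of the ratchet.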

Having sex is the answer because this can compensate for the losses.

Some asexual organisms are sperm-dependent and accept small pieces of parental genome, so-called microchromosomes. Some organisms even take genome pieces from another species (for example the Amazon molly Poecilia formosa, which mates with P.mexicana or with P.latipinna). There is some discussion whether some organisms can transmit these microchromosomes to their children for a limited number of generations.

In this project, we investigate whether these strategies hamper or may even arrest Muller's ratchet. It seems evident that such microchromosomes protect their mutated counterparts, which would slow Muller's ratchet.

Our simulation results were surprising. These microchromosomes did protect. However, Muller's ratchet went faster instead of slower. Apparently such parental genome pieces act as a type of protecting umbrella that allows many deleterious mutations to remain in the population that would otherwise have been eradicated by selection.

  • Beukeboom,L. & Batenburg,F.H.D.van: (in prep.)