Scientific psych open source workbench: Java developers wanted.
This post is meant to attract volunteers for a scientific workbench (I also posted this message on SourceForge, as an idea rather than a project as such, and on DevNetwork.net). You do NOT need a background in cognitive science, psych or anything similar.
This message is really lengthy. It has 3 sections: Section 1 is for those not familiar with the brain and its processes, Section 2 is for those who are, and Section 3 covers everything else that is important for the tool (scroll way down).
SECTION 1: for those not familiar with brain processes:
Metaneva is an open source workbench that tries to figure out how the brain works: which areas do what, and why. Much is already known about the brain in general, but really detailed information is missing. We use a special way of looking at these processes.
We encode conducted experiments into a DB (MySQL). We already have the upload form. We use and develop an ontology for this DB. This ontology is in alpha.
Once there is data, the most common query will be 'give me all articles where you find these stimuli in combination with these tasks, and show me what areas are active'.
Another highly interesting one is: fill in a virtual experiment and check whether there is a similar experiment in the DB.
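To give coders a feel for that most common query, here is a minimal Java sketch. It is not our actual code or DB schema: the `Experiment` class, its fields and the sample rows are hypothetical stand-ins for encoded experiments, kept in memory instead of MySQL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ParadigmQuery {
    /** Hypothetical in-memory stand-in for one encoded experiment. */
    static final class Experiment {
        final String article;
        final Set<String> stimuli;
        final Set<String> tasks;
        final Set<String> activeAreas;

        Experiment(String article, Set<String> stimuli, Set<String> tasks, Set<String> activeAreas) {
            this.article = article;
            this.stimuli = stimuli;
            this.tasks = tasks;
            this.activeAreas = activeAreas;
        }
    }

    /** Small helper to build a sorted set from a list of terms. */
    static Set<String> set(String... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    /**
     * Returns article -> active areas for every experiment that contains
     * ALL requested stimuli in combination with ALL requested tasks.
     */
    static Map<String, Set<String>> activeAreas(List<Experiment> db, Set<String> stimuli, Set<String> tasks) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Experiment e : db) {
            if (e.stimuli.containsAll(stimuli) && e.tasks.containsAll(tasks)) {
                result.computeIfAbsent(e.article, k -> new TreeSet<>()).addAll(e.activeAreas);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<Experiment> db = Arrays.asList(
            new Experiment("Article A", set("color patch"), set("delayed choice"), set("orbitofrontal")),
            new Experiment("Article B", set("tone"), set("delayed choice"), set("auditory cortex")));

        // Prints {Article A=[orbitofrontal]}
        System.out.println(activeAreas(db, set("color patch"), set("delayed choice")));
    }
}
```

In the real tool the filtering would presumably happen in the DB query itself; the sketch only shows the shape of the question and of the answer.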
3. What information is gained?
It helps to analyze functions
It helps to refine brain area functions (what does this area do?)
It counters redundant experiments (by first checking whether our DB already holds such an experiment).
4. Who do we need?
Coders, coding language depends on the volunteers. We used to code in PHP, but Java or any other language might be more appropriate for the bench.
It's open source, and as such we work in clusters. There are those who provide the queries and the general concepts of how to do things. Coders have a very important role in determining whether we have to reorganize or adjust the queries.
Coders have full control over the language and over which query they want to code. We have many queries that are possible candidates, and those who are willing can pick whichever they want to code.
We also might need a programmer coordinator. Needless to say that we do not have a formal structure as such.
SECTION 2: for those familiar with brain processes:
1. About the Project:
A functional (meta-)analysis tool targeting the cognitive scientist who is interested in animal data analysis.
A little more:
I assume that this post does not need a full explanation of the tool, since it would take quite a few pages to explain the what, how and why. Let's just say that we have very strong methodological arguments indicating the need for and usefulness of our tool. If you have a background in cognitive science: our tool tries to map paradigm data to functional attribution based on paradigm elements instead of on researchers' functional bias. If you do not have such a background, you could describe our tool as follows: functional attribution (e.g. brain area X does function Z: orbitofrontal does 'decision coding' in the brain) is currently based on interpretations of previous studies (e.g. a study that said this brain area does this). But the actual data (e.g. the specific experiment with its specific stimuli, tasks, species...) is lost, since it is impossible for a human to hold all the data of every experiment. So a researcher does the following: remember the conclusions of previous research (e.g. orbitofrontal does decision), build their own experiment (does the orbitofrontal also do this particular decision?) and draw their own conclusions.
Informatics can of course do a more thorough analysis, by mimicking these individual researcher tactics and adding an impressive data source to them (incorporating paradigm data such as stimuli...). We know our first queries and are already writing out the second set. This simply means: we (conceptually) know how to mimic research tactics for 2 queries and are waiting to translate them into code. Do note that this is a novel idea for cognitive science, but is in fact common practice in e.g. biology or chemistry. As such, psych is behind in the overall evolution towards informatics-based analysis. Hence, we would possibly be the first such psych tool, and people have already shown interest. But, as usual, we must provide results before anyone actually believes in our potential. Needless to say, we would be thrilled to create a high-level open source project (which is not very common in a field where projects are often measured on the basis of funding and spin-offs). Read more on our open source motivation later in this post.
We are a small group of enthusiasts and volunteers. The core team consists of 3 people, with 2 active consultants and various 'interested people'. So in comparison to other neuroinformatics teams, this is really small. But there is no need to create a huge team, since our workbench is rather simple in comparison with other neuroinformatics tools. The beauty of this project is that it is in fact rather simple, but has a lot of power when it comes down to its results. The actual 'idea' behind it is not highly complex. Translating it into code, well, you guys be the judge of that.
We use animal data, and with reason (again methodological reasons, so I won't go into detail). This animal data is of course supposed to be integrated with human data. So there is a direct relevancy for human data; however, the cognitive sciences currently have no other option than focussing on specific species before moving to cross-species comparisons.
We target cognitive scientists interested in functional analysis. We do not integrate 'raw' data (for now) such as e.g. raw spike rates or raw fMRI data.
- Create an open source, open knowledge expert system:
There are a few trends regarding the source code of scientific informatics. One is to claim it is free/open while it is in fact locked and only free for uploading data. Another is to lock it and fund it with university funding. Another is to actually open it up.
We strongly support the open source policy: creating a tool that is freely accessible, using a GNU license (or a similar one). Data is often not shared happily, since sharing data is often considered losing the scope and losing funding. This also means that progress is made on a team basis: if a team makes progress and releases a new edition, the community gets an update. We want to avoid team-based progress and make this a community-driven workbench, since it will benefit more from sharing than it would from locking.
This is connected to our strong conviction that knowledge should not be locked. Psych is the science where we try to describe mechanisms in order to cure and treat individuals. Locking such code would only mean locking knowledge, and goes right against this principle. Knowledge is to be shared, and progress can only be made when everybody is able to use it and gain insights. We do not support knowledge filtered for those who can afford it.
- Counter redundant research in animal testing
Currently there are research teams that use monkeys, rats, birds... to test various behavioral mechanisms. It is currently not common practice to 'avoid' animal testing. Do realize that such experiments use invasive techniques and that animals are used for the general good. We would like to get the most out of such data sets by integrating all experiments into our workbench. This would reduce redundant research and avoid experiments that have already been done but are not known to the researcher. Why are they not integrated already? Simply because no tool allows for detailed integration of the data sets, even though there are a lot of experiments out there. The best one can do is query scientific databases using keywords (decision, rat, single unit recordings...). When you realize that psych does not have a controlled ontology (vocabulary), it should be obvious that currently everyone simply uses their own keywords. As a consequence, data sets sometimes become obscure, since nobody searches for articles using the keywords attributed to them (indeed, we have 'trends' in keywords). Hence, a lot of animal data is lost, and that's one of the motivations for building this tool.
- Avoid group member errors:
Freely distributed analysis to allow the scientific community to check and correct possible mistakes. Not only the core developers can comment or improve the code, but an entire community can. This is strongly connected to open knowledge and the evolution towards a community driven tool. This is new for the community, but at the same time the stronghold of the tool.
I will keep the description of practical goals short, since more is available on request. This post is intended to tickle people's interest and not to present a full report of the practical goals.
- create functional queries:
Creating PHP queries to gather information regarding functional mechanisms. This is our first and elementary query of the data. It will be used to evaluate future data uploads and analysis.
- sparsity of the brain:
Query and test whether the brain has 'functional areas' in the first place. This relies strongly on the previous goal, and is motivated by criticism from psychs who do not believe that effective connectivity is equal to functional connectivity.
- test run experiments
Enter an experiment that you would like to run, and check whether it has already been done by someone else (and which you missed during your literature study). This is one of the more obvious 'avoid redundancy' results.
- Compute functional networks
Using the previous short-term goals, we can connect functional mechanisms. E.g. how do elementary mechanisms ('color perception', 'object recognition'...) integrate with higher-order processes ('attention', 'decision', 'emotion'...)? Is it top-down, bottom-up, waterfall...?
- Cross-species comparisons
Compare human and non-human data with a data-driven analysis. Avoid using 'functional interpretation' (e.g. this is decision) and only use data (stimuli, task...) to compare animals. This allows for highly detailed analysis and might correct commonly accepted (yet theoretically unproven) assumptions.
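For the 'test run experiments' goal above, one possible way to check a planned experiment against the DB is to compare sets of paradigm elements. The sketch below is only an illustration under assumptions: it uses Jaccard overlap (shared elements divided by all distinct elements) as the similarity measure, which is our placeholder here and not a decided method, and the element labels, study names and the 0.7 threshold are all made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SimilarExperimentCheck {
    /** Small helper to build a set of paradigm elements (species, stimuli, tasks...). */
    static Set<String> elements(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    /** Jaccard overlap: shared elements / all distinct elements; 0 if both sets are empty. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        Set<String> all = new HashSet<>(a);
        all.addAll(b);
        return (double) shared.size() / all.size();
    }

    /** Returns the names of stored experiments whose overlap with the planned one reaches the threshold. */
    static List<String> similar(Map<String, Set<String>> db, Set<String> planned, double threshold) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : db.entrySet()) {
            if (jaccard(entry.getValue(), planned) >= threshold) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up stored experiments, for illustration only.
        Map<String, Set<String>> db = new LinkedHashMap<>();
        db.put("Study 1", elements("species:rat", "stimulus:tone", "task:go/no-go"));
        db.put("Study 2", elements("species:monkey", "stimulus:color patch", "task:delayed choice"));

        Set<String> planned = elements("species:rat", "stimulus:tone", "task:go/no-go", "technique:single unit");

        // Study 1 shares 3 of 4 distinct elements (0.75), Study 2 shares none.
        System.out.println(similar(db, planned, 0.7)); // prints [Study 1]
    }
}
```

Whether plain set overlap is good enough, or whether the ontology should weigh some elements more than others, is exactly the kind of thing we would like coders to weigh in on.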
3. Open Sourcing/ Open Knowledge.
I already stressed our motivation towards open knowledge, so if you're convinced that we are dedicated, do skip this section. But I make it another section of this post, since we do want to be very clear on this.
- Why do we focus so much on it? Well, we do not want to be forced to chase funding and create a closed project, since this would only benefit a few and destroy some of the primary goals (e.g. open knowledge). There is a strong tendency within psych informatics to create standards that will make a lot of money. Becoming the standard in the field would potentially create lucrative spin-offs, but would counter community-driven knowledge. We do not support this.
- There is a tendency to open up knowledge, and create data sharing among the community. However, this does not go all that well. It needs a stronger boost and hence the open source community is more than welcome in these evolutions.
- Biology benefits from the open sourcing, hence psych should be no different.
- Create a tool accessible to all, regardless of funding within the particular institution
4. Relevancy: fundamental or real-life research?
Quite a tough question really. The direct relevancy/effect will be the possibility to avoid redundant research, to compare animal and human data, to derive functional processes in the brain, to stimulate functional mapping and hence support e.g. surgery in knowing which area does what (and what to remove or not), to understand psychological disorders and treat what was previously untreatable... but do know that we are far from understanding all that happens in the brain. It's fundamental all right, but not the type where one cannot see its direct relevancy. Knowing how it works enables knowing how to treat what goes wrong.
It will not enable you to tune your car, to pimp your iPod or to fill up your bathtub from behind your computer (unless we unravel it all and connect it to our Bluetooth). It might help in tuning your partner by tweaking their decisions (scan, redirect and you're done), but then again that would not be that cool/friendly either. Hey, it is research as such, and it has a direct relevancy. At least, that's what we believe.
5. Who/what do we need?
There is no need to filter for those with a background in neuroinformatics, cognitive sciences or neuroscience. We provide the psych background, and explain what we need. It is even said in various papers that a truly multidisciplinary team (as in 'I do not know your field that well, but based on your info I would think that...') is the best way of organizing and creating such a project. We strongly need multidisciplinary 'interested' people, but do not exclude those who do not know that much about cognitive science. Being interested is the only criterion.
Apart from that, we need coders who are experienced in PHP/MySQL. We decided to go for a web-based tool (see further on in this post) and to use PHP/MySQL and possibly Java for it. Of course, we can revisit these decisions if the coders recommend other (better) approaches.
Support is strongly needed for (PHP) query coding, and minor support needed for CMS maintenance and graphic design of the tool (forms...). We already have a graphic designer who can create the look&feel, but can always use help regarding translating these into (HTML) GUI.
Even if you are interested but not sure if you might be able to help, drop a line and we'll see how we might be able to use your help.
6. What did we already agree upon (languages, structure...)?
- We decided in the beginning to create a web-based workbench, making it easily accessible. Hence we decided to use MySQL (MyISAM) and PHP.
- We reached our first milestone by working out and organizing how to add data to the DB. We created the DB structure, and consider it a stable structure. It is able to integrate a very wide variety of experiments and is easily extendible. Reaching this first milestone was our criterion for going to the open source community and opening the project up. Without such a structure, there is no use in attracting additional members.
SECTION 3: Other important stuff:
1. Current state of the project.
- 3 core people: 1 DB maintainer and upload forms (Nikos), 1 query coder (David), 1 consultant for data mining (Remco), neuropsych consultants (Jan and Matthew) and 1 coordinator/data uploading (me).
- Host: shared server
- Languages: MySQL/PHP. EDIT: We are currently revising this with a view to migrating to Java.
- SVN: under development
- Website: As a new member on this forum I can not post the url, but check the email and go to that one. Not to offend someone (by referring to the url) but to give additional info for interested people.
- Currently developing an experimental paradigm ontology (a standard controlled ontology for the cognitive sciences)
2. Heck, are you guys volunteers?
Yes, we are. Some of us are academics (the 'psych bunch'), others are volunteers with no academic position. Our group strongly believes in open sourcing and tries to reach its goals that way. Some focus on the animal ethics, others are interested in the scientific goals, and others are simply interested in the complex data mining.
All are nevertheless volunteers, since open sourcing such a novel approach does not stimulate funding (and funds are under pressure lately). And we do not want to go to private funds, since it would compromise the goals.
3. How to contact us?
Drop a line on this post, pm me or send an email to collab(at)metaneva (dot) org.
Do not hesitate to ask further questions or send comments.
If you are interested in collaborating, do mention your specialty, and give us some idea on what we can expect. This is mainly to be able to judge and organize the collab.