The Drizzle project is planning to take part in the Google Summer of Code 2010 program. This program endeavors to fund students to contribute to an open source project over the summer break.

Submission Template

Project Title

Description: a paragraph describing the project

Suggested background/interests: Provide a brief description of the technologies (such as programming languages) the student should have some knowledge of.

Mentors: Mentors for this project

Students: The students the project has been assigned to

Project Ideas

A template is provided above to use for listing a project. Feel free to modify the template and add more project suggestions.

Implement Table Elimination in Drizzle

Description: Table elimination is a feature that was added to the optimizer in MariaDB. It would be nice to add this feature to Drizzle. It may not be too difficult to do right now since the optimizer in Drizzle is not significantly different from the optimizer in MariaDB yet.

More in-depth information on the implementation in MariaDB is available at: http://askmonty.org/worklog/Server-Sprint/?tid=17

This project would involve first coming up with an approach to implementing the table elimination feature in Drizzle. Test cases should then be created which demonstrate cases where table elimination could be used before the feature is implemented. Finally, the feature will be implemented and tested.
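
As a rough illustration of the idea (this is not MariaDB's or Drizzle's optimizer code), the sketch below decides whether the inner table of a LEFT JOIN could be dropped. It assumes two conditions: nothing outside the ON clause reads the inner table, and the join condition binds a complete unique key of the inner table, so each outer row matches at most one inner row. The LeftJoin class and can_eliminate function are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class LeftJoin:
    """Hypothetical, simplified description of `outer LEFT JOIN inner ON ...`."""
    inner_table: str
    # Columns of the inner table that the ON clause equates to outer-side values.
    inner_join_columns: frozenset
    # Unique keys of the inner table, each one a frozenset of column names.
    inner_unique_keys: list = field(default_factory=list)


def can_eliminate(join, referenced_columns):
    """Return True if the inner table can be removed without changing results.

    referenced_columns is a set of (table, column) pairs used anywhere
    outside the ON clause (select list, WHERE, GROUP BY, ORDER BY).
    """
    # Condition 1: nothing outside the ON clause reads the inner table.
    if any(table == join.inner_table for table, _ in referenced_columns):
        return False
    # Condition 2: some unique key is fully bound by the join condition,
    # so every outer row matches at most one inner row.
    return any(key <= join.inner_join_columns for key in join.inner_unique_keys)


join = LeftJoin("address", frozenset({"person_id"}), [frozenset({"person_id"})])
print(can_eliminate(join, {("person", "name")}))    # only outer columns are read
print(can_eliminate(join, {("address", "city")}))   # an inner column is read
```

Test cases written before the feature exists could follow the same split: queries where both conditions hold, and queries that violate exactly one of them.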

Suggested background/interests: Some knowledge of C++ would be useful for this project along with a general interest in databases and query optimization in particular.

Mentors: Padraig O'Sullivan (posulliv on IRC)

Students: available

Replication Plugin for Replicating from Drizzle to Memcached

Description: Drizzle has an extremely simple to understand replication API and all interesting replication work happens in the form of a plugin.

This project involves developing a replication plugin for Drizzle that replicates to a memcached server. I have developed a simple version of this previously and it could be used as a stepping stone for a potential student in this project. Much more could be investigated, however; in particular, how data is stored in memcached, i.e. what key to use to store data and what format to store the data in.
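
To make the key and format question concrete, here is one possible scheme, offered only as a starting point for discussion. It assumes a key of the form schema.table:primary-key and a JSON-encoded row as the value. A plain dict stands in for the memcached client, and make_key and apply_change are hypothetical helpers, not part of the Drizzle replication API.

```python
import hashlib
import json

MAX_KEY_BYTES = 250  # memcached's key length limit


def make_key(schema, table, primary_key):
    """Build a cache key from the row's schema, table and primary key values."""
    pk = ",".join(str(v) for v in primary_key)
    key = f"{schema}.{table}:{pk}"
    raw = key.encode("utf-8")
    # memcached rejects keys that contain whitespace or control characters
    # and keys longer than 250 bytes; fall back to a digest in those cases.
    if len(raw) > MAX_KEY_BYTES or any(b <= 32 or b == 127 for b in raw):
        return "sha1:" + hashlib.sha1(raw).hexdigest()
    return key


def apply_change(cache, op, schema, table, primary_key, row=None):
    """Apply one replicated row change to the cache (a dict stand-in here)."""
    key = make_key(schema, table, primary_key)
    if op == "DELETE":
        cache.pop(key, None)
    else:  # INSERT or UPDATE: store the latest image of the row
        cache[key] = json.dumps(row, sort_keys=True)
    return key


cache = {}
apply_change(cache, "INSERT", "test", "t1", (42,), {"id": 42, "name": "a"})
print(cache)
```

JSON is only one option; a student could equally compare it with a binary format or with storing each column under its own key.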

More information on replication in Drizzle is available on Jay Pipes' blog. In particular, have a look at the following post: http://www.jpipes.com/index.php?/archives/302-Drizzle-Replication-Changes-in-API-to-support-Group-Commit.html

This project could be really interesting in my opinion, and it is something that the community at large would be very interested in seeing the results of.

Suggested background/interests: Some knowledge of C++ would be useful for this project. A prospective student should also become familiar with memcached and the replication API in Drizzle (read the links in the description paragraph above).

Mentors: Padraig O'Sullivan (posulliv on IRC)

Students: spsneo is interested.

Status: Djellel Eddine Difallah is working on this. [see proposal] http://drizzle.org/wiki/GSOC_Replication_Plugin_for_Replicating_from_Drizzle_to_Memcached_jalel

Development on the Boots Command Line Tool

Description: Boots is a new project that provides the next generation command line tool for not just Drizzle, but for other database systems as well. It aims to provide the same flexibility as the Drizzle server in terms of modularity, making it easy to extend with new functionality. The project has the core set of requirements met, but there is still much to do. Specific tasks can be discussed and there should be enough time to implement a few. You can read more about the project and specific tasks that need to be completed (blueprints) at: https://launchpad.net/boots

Suggested background/interests: Knowledge of Python, SQL, and general relational database concepts is a must. It would also be beneficial to know how to manage and process large amounts of data efficiently, as this tool is used for dumping and importing large data sets. Various tasks will benefit from other areas of knowledge; for example, the curses or Python GUI interface options would benefit from experience with these.

Mentors: Eric Day (eday on IRC)

Students: Ashish Sharma is interested. Available, could accept more than one student.

Development on Table Functions for the Information Schema

Description: Drizzle uses functions to implement its data dictionary. The goal of this project is to develop a set of plugins for the information_schema that matches the ANSI SQL standard. There are 47 tables in total in the standard. Once these tables are built it should be possible for any ANSI designed tool to be able to talk to Drizzle.

Suggested background/interests: Knowledge in C++ and some SQL. A good knowledge of the STL is not required, but it will make the task much easier.

Mentors: Brian Aker (krow on IRC)

Students: kumar lav is interested (lafau on IRC); arpit is interested (email: arpit4maheshwari@gmail.com)

Replacement Memory Engine

Description: The current Memory engine has many limitations. Total object size is less than what can be done with a blob, the memory allocation system is based on a C block allocator which does not work with C++, and it is a table-level locking engine. The goal will be to redesign the engine for both heap and range indexes, use a modern allocator, and allow it to swap blocks of data out to disk as required.

Suggested background/interests: Knowledge in C++ and a strong understanding of data structures.

Mentors: Brian Aker (krow on IRC)

Students: available, could accept more than one student

Port mtr2 to Drizzle

Description: Drizzle uses version 1 of the "mysql-test-run" program. In the MySQL tree, there is a much improved "version 2". We have since made changes to our version of mysql-test-run (now drizzle-test-run) that need to be ported over (so it's not a straight file copy). Someone working on this may also want to work on the other testing task.

Suggested backgrounds/interests: Perl Programming, Knowledge of SQL. C/C++ not required, but good Perl hacking is.

Mentors: Stewart Smith (stewart on IRC)

Students: available, probably only 1 required.

Make test runner installable

Description: Drizzle uses a testing system derived from MySQL's which is written in Perl. It is currently structured such that both the test cases and the test-running system must live in the tree. To support out-of-tree plugins, it would be great if Drizzle installed the test-runner and that runner would be able to operate on any collection of test cases in a given directory. This will require cleaning up the perl so that proper modules are installed and also cleaning up some of the path-based assumptions in the code. Anyone wanting to work on this should consider also working on mtr2 above.

Suggested backgrounds/interests: Perl Programming, Knowledge of SQL. C/C++ not required, but good Perl hacking is.

Mentors: Monty Taylor (mtaylor on IRC)

Students: available, probably only 1 required.

Develop Replacement Testing Framework

Description: Create a replacement for the combination of test-run.pl and drizzletest.cc that are currently used to run tests in the tree. In order to prevent scope creep, this system should consume the .test files currently in the tree and produce/test against .result files. However, at this point a single system, preferably written in Python to match the rest of the utility programs in the tree, could do the job of both test-run.pl and drizzletest.cc and at the same time be more maintainable.
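
The core loop of such a runner could be as small as the sketch below. It is deliberately simplified and makes assumptions that the real .test format does not guarantee: every non-blank line that does not start with # is treated as one statement, the expected output is the statement echoed followed by its result lines, and execute is a hypothetical callable that would send the statement to a server and return its output lines.

```python
import difflib


def run_test(test_text, execute):
    """Produce the output for one test: each statement echoed, then its result lines."""
    out = []
    for line in test_text.splitlines():
        stmt = line.strip()
        if not stmt or stmt.startswith("#"):
            continue  # skip blank lines and comments
        out.append(stmt)
        out.extend(execute(stmt))
    return "\n".join(out) + "\n"


def compare(actual, expected):
    """Return a unified diff as a list of lines; an empty list means the test passed."""
    return list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile="expected", tofile="actual", lineterm=""))


def fake_execute(stmt):
    # Stand-in for a real server connection, for demonstration only.
    return ["1"] if stmt == "SELECT 1;" else []


actual = run_test("# a comment\nSELECT 1;\n", fake_execute)
for diff_line in compare(actual, "SELECT 1;\n2\n"):
    print(diff_line)
```

A real replacement would also need to handle the directives that drizzletest.cc understands, which this sketch ignores entirely.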

Suggested background/interests: Testing. Python programming. Ability to read Perl and C.

Mentors: Monty Taylor (mtaylor on IRC)

Students: available, 1 or 2 could work on this

Develop Out-of-Tree Plugin Testing Framework

Description: Create a testing framework (in a scripting language such as Python) to test out-of-tree plugins against a Drizzle server. Anyone wanting to work on this should also consider looking at the three tasks above. For instance, if the current test-run system can be installed and run in an arbitrary directory, then this task is unneeded. On the other hand, a new test-runner system, if designed properly, could satisfy both this task and that one.

Suggested background/interests: Investigate what PECL and CPAN do for testing out-of-tree extensions and plugins.

Mentors: Jay Pipes, Monty Taylor

Students: Available (1 or more students)

Filesystem storage engine

Description: like http://code.google.com/p/mysql-filesystem-engine/ - write a storage engine for Drizzle that reads data straight out of the file system, to be used for reading /proc entries and others.

Suggested backgrounds/interests: C++ Programming

Mentors: Stewart Smith (stewart on IRC)

Students: maqingli (email: qinglics@gmail.com), arpit (email: arpit4maheshwari@gmail.com), and deepak (deed1234@gmail.com) are interested. Available.

Column based ARCHIVE Storage Engine

Description: The ARCHIVE storage engine compresses one row after another. This project is to have an ARCHIVE storage engine that supports having one file per column. The design is pretty simple: one file contains the NULL bitmap of the rows. There is also one file for each column. For columns that aren't NULL, the next entry in the file for that column is read. This approach should be faster for certain read queries for ARCHIVE like tables as well as giving us a simple storage engine to explore the needs of column stores with.
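
The layout described above can be sketched in a few lines. This is only a model of the idea, under several assumptions: Python lists stand in for the files, the NULL bitmap is kept as one integer per row with bit i set when column i is NULL, and compression is left out entirely. The function names are made up for the example.

```python
def write_rows(rows, columns):
    """Split rows (dicts) into one NULL-bitmap stream plus one value stream per column."""
    null_bitmaps = []
    streams = {c: [] for c in columns}
    for row in rows:
        bits = 0
        for i, c in enumerate(columns):
            if row.get(c) is None:
                bits |= 1 << i  # bit i set means column i is NULL in this row
            else:
                streams[c].append(row[c])
        null_bitmaps.append(bits)
    return null_bitmaps, streams


def read_rows(null_bitmaps, streams, columns, wanted=None):
    """Rebuild rows, touching only the streams of the wanted columns."""
    wanted = list(columns) if wanted is None else wanted
    cursors = {c: iter(streams[c]) for c in wanted}
    for bits in null_bitmaps:
        row = {}
        for i, c in enumerate(columns):
            if c not in cursors:
                continue  # column not requested: its stream is never read
            # For a non-NULL column, the next entry in that column's stream is read.
            row[c] = None if bits & (1 << i) else next(cursors[c])
        yield row


cols = ["id", "note"]
bitmaps, streams = write_rows([{"id": 1, "note": None}, {"id": 2, "note": "x"}], cols)
print(list(read_rows(bitmaps, streams, cols, wanted=["note"])))
```

The wanted parameter shows where the read speedup would come from: a query that needs one column only reads the bitmap and that column's data.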

Suggested backgrounds/interests: C++ programming, C programming, data storage/compression.

Mentors: Stewart Smith (stewart on IRC)

Students: Vijay Samuel (vjsamuel on IRC) is interested.

Cloud based storage Storage Engine

Description: Like http://fallenpegasus.com/code/mysql-awss3/ , write a Storage Engine (or adapt awss3) for Drizzle - allowing storing and retrieving of blobs from Cloud Storage services such as S3 and Rackspace Files.

Suggested backgrounds/interests: C++ programming, cloud storage

Mentors: Stewart Smith (stewart on IRC)

Students: available

MySQL to Drizzle Syntax Rewriter Plugin

Description: A query rewriter which would enable SQL statements written in MySQL's SQL syntax to be translated into Drizzle's SQL syntax

Suggested background/interests: The plugin could be based on the existing query rewriter plugins and be a very useful tool to aid in migrations and testing of Drizzle.

Mentors: Jay Pipes

Students: neh, george, and Ioana1 are interested in this!

Available (1 or more students)

Add a Proper Unit Testing Framework to Drizzle

Description: Add a unit testing framework to Drizzle and create unit test cases for all public APIs at the very least

Suggested background/interests: GTest is probably the best all-around unit-testing framework for C++; it is possible to investigate others

Mentors: Jay Pipes

Students: PaulB is interested in this

Status: Pawel Blokus is working on this.

Available (1 or more students)


Does GTest here refer to the Google test framework or the Gnome test framework? I don't think the Gnome framework is really suitable for Drizzle. The Google one might be ok, but CPPUnit will probably integrate better with Hudson as it already has support for JUnit-style XML output via the subunit integration that dtr has. -- RobertCollins

Develop SphinxSE Storage Engine for FULLTEXT

Description: Develop a storage engine plugin for the Sphinx fulltext search engine.

Suggested background/interests: Easiest approach would be to study the existing MySQL SphinxSE plugin and work with mentors to adapt for Drizzle. Hardest part will be re-implementing the MATCH ... AGAINST syntax.

Mentors: Many available...

Students: The students the project has been assigned to

IPv4, IPv6, MACADDR Native Column Types

Description: We removed the UNSIGNED column attribute that came from MySQL. As a consequence, IPv4 addresses need to be stored in BIGINT columns or rely on application-level hacks to display them as unsigned values. Adding native types will require some refactoring of the Field system. See here for more information:

https://lists.launchpad.net/drizzle-discuss/msg06283.html

This project would add IPv4 and possibly IPv6 column types to Drizzle so that a Drizzle user could simply type:

CREATE TABLE sessions (
  id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY
, ip_address IPv4 NOT NULL
);

Extra credit: Add in INET_NTOA() and INET_ATON() functions from MySQL:

http://dev.mysql.com/doc/refman/5.1/en/miscellaneous-functions.html
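
To show why the missing UNSIGNED attribute matters here, the sketch below mimics the two functions in Python using the standard ipaddress module. It only handles full dotted-quad addresses (MySQL's INET_ATON() also accepts short forms, which are ignored here), and the lowercase function names are just for this example.

```python
import ipaddress

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit INT column


def inet_aton(text):
    """Dotted-quad string to integer, like INET_ATON() for full addresses."""
    return int(ipaddress.IPv4Address(text))


def inet_ntoa(number):
    """Integer to dotted-quad string, like INET_NTOA()."""
    return str(ipaddress.IPv4Address(number))


n = inet_aton("192.168.0.1")
print(n, n > INT32_MAX)   # 3232235521 True
print(inet_ntoa(n))       # 192.168.0.1
```

Any address at or above 128.0.0.0 exceeds the signed 32-bit range, which is why such values currently end up in BIGINT columns.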

Suggested backgrounds/interests: C++ programming

Mentors: Many available

Students: available

Contacting the Mentors

The mailing list is the best way to express interest in a project. The mentors listed next to the above project ideas are also available on IRC most days. Their IRC handle is listed next to their name above.

Proposal Guidelines

Students are responsible for writing a proposal and submitting it to Google before the application deadline. The following outline was adapted from the Perl Foundation open source proposal HOWTO. A strong proposal will include:

We would prefer that development discussion occur on our project mailing lists when possible, with special recognition being given to those students who vet their proposal with community developers before submitting their proposal to Google SoC. This is not required, but can have a large impact on the chances of your proposal being accepted, so please don't be shy. In any case, you will be required to keep open lines of communication with your mentor should you be accepted, so if you have circumstances that may affect this, please explain them up front in your proposal.

Acceptance Criteria

Generally we look for genuine enthusiasm and initiative in student applications. Some specific criteria that we will be using when going through student proposals are:

Previously Accepted Projects

The entire project abstracts can be found on Google's site here.

Frequently Asked Questions

Am I eligible?

Please see the Google Eligibility FAQ for all questions about eligibility.

When is the proposal deadline?

According to the Summer of Code FAQ, the deadline for submitting student proposals is April 9th, 2010 (19:00 UTC). Please remember that proposals must be submitted to Google directly, although we are happy to discuss any proposals with you ahead of time.

Useful Links
