This is the workspace for Nathan Williams, who is working on the Google Summer of Code project to refactor and the custom List<> implementation with the STL's list, vector, and queue for Drizzle. Monty Taylor is mentoring Nathan.
Overall Project Goals
Identify Usage Locations and Patterns in Existing Kernel
Before we can begin refactoring, it's essential to get a documented list of the locations of where the custom List, I_List, List_iterator, and List_iterator_fast classes are used in the codebase. Once these locations are identified, we must figure out which of the STL's standard template containers, if any, are appropriate replacements. Each container has different usage and efficiency patterns, and therefore it is important to understand the existing usage patterns in order to properly pick a replacement from the standard library.
Performance Considerations
In order to measure any impact on performance, similar to what we saw with a regression on the STL bitset replacement, it is first necessary to have a baseline set of benchmarks at varying concurrency levels. Post bitset replacement, there is now a staging tree where an automated suite suite is run before any merges into the trunk. This project is container refactoring only with no new logic. This allows for small code changes with fast iterations cycles and, perhaps most importantly, small merges. I expect some regression to be seen as more areas of use are switched to standard containers due to the current implementations being intrusive and utilizing pool allocators. The harm or benefit of those facts will be seen once work is underway. When the project attacks is overall memory management strategy, those potential hits can be remedied with custom allocators.
Replace Major Custom List Class Implementations
For the major pieces identified above, refactor the kernel to use the STL's replacement container classes. Break this into multiple steps once they are identified above.
Time line and Deliverables
List and I_List Usage Analysis
- May 23 – June 1
- Identify usage locations, scenarios, and affected files to be re-factored.
Small Contribution
- June 1 – June 8
- Initial edit, test, and merge cycle to become familiarized with the process.
Iteration Cycle One
- June 8 – July 6
- Approximately half of identified usages re-factored, tested, and merged.
Iteration Cycle Two
- July 13 – August 10
- Remaining usages are re-factored, tested, and merged.
Final Review
- August 10 – August 17
- Past work reviewed again and modified as needed. Time permitting, additional custom containers are replaced.
Research
Prior Work
Replace-LIST - RedBrain's work earlier this year on tackling the list templates
Launchpad Blueprint
list_node
- Definition:
- Public Data:
- list_node* next - Link to next node
- void* info - The data being stored
- Public Interface:
- list_node(void *info_par,list_node *next_par)
- list_node()
- Sets info to null, next to this
- Only used with special sentinel node
- Notes:
- The actual node structure of the list
base_list
- Definition:
- Public Data:
- uint32_t elements - Number of elements in list
- Public Interface:
- base_list()
- Default constructor, initially empty
- base_list(const base_list &tmp)
- Copy Constructor
- performs shallow copy
- base_list(const base_list &rhs, MEM_ROOT *mem_root)
- Constructor
- Deep copies list, shallow copies elements
- void empty()
- Empties the list, removing all elements
- bool push_back(void *info)
- Creates a node using new, adds to end of list
- Returns allocation success/fail, though only by checking pointer returned from new
- try/catch or std::nothrow are not used, so exception would propogate up
- bool push_back(void *info, MEM_ROOT *mem_root)
- Creates a node using new with mem_root
- Returns allocation success/fail
- bool push_front(void *info)
- Creates a node using new, adds to front of list
- Returns allocation success/fail, though only by checking pointer returned from new
- try/catch or std::nothrow are not used, so exception would propogate up
- void remove(list_node **prev)
- Unlinks prev from the list
- void concat(base_list *list)
- Attaches parameter to end of current list
- void* pop()
- Removes first node
- Returns data it contains
- void disjoin(base_list *list)
- Inverse of concat, finds start of list in self and unlinks it
- void prepand(base_list *list)
- Inserts parameter before the first node in current list, and sets the the start of parameter as first
- void swap(base_list &rhs)
- Swaps contents of this and rhs
- list_node* last_node()
- Returns pointer to last node
- list_node* first_node()
- Returns pointer to first node
- void* head()
- Returns data of first node
- void** head_ref()
- Returns pointer to data of first node
- If list is empty, it returns 0
- bool is_empty()
- Returns whether or not list is empty
- list_node* last_ref()
- Returns pointer to the end of every list (the sentinel node)
- Notes:
- Take head of comment before shallow copy constructor, as it says some places rely on that behavior - though doesn't say where
- Though sentinel node is primarily an implementation detail, look for usages of last_ref
base_list_iterator
- Definition:
- Public Data:
- Public Interface:
- base_list_iterator()
- Default constructor, initializes all members to 0
- base_list_iterator(base_list &list_par)
- void init(base_list &list_par)
- Initializes iterator data
- Iterator is not ready to use, only ready after a call to next
- void *next()
- Points iterator to next node
- Returns data from that node
- void *next_fast()
- Points iterator to next node
- "Fast" because it doesn't set previous to current
- Returns data from that node
- void rewind()
- Points iterator to first element in list
- void *replace(void *element)
- Sets the nodes data to element
- Returns the old data
- void *replace(base_list &new_list)
- Inserts new_list into list at current iterator position
- Returns data iterator was pointing out (it was removed from list)
- void remove()
- Remove node from list that iterator is pointing at
- Iterator is invalidated
- void after(void *element)
- Insert element into list, after where iterator is pointing
- Iterator is incremented (now pointing to node containing element)
- void **ref()
- Returns pointer to pointer of data iterator is pointing at
- bool is_last()
- Returns true if iterator has reached the end of the list
- Notes:
List
- Definition:
- Public Data:
- Public Interface:
- List()
- List(const List<T> &tmp)
- List(const List<T> &tmp, MEM_ROOT *mem_root)
- bool push_back(T *a)
- bool push_back(T *a, MEM_ROOT *mem_root)
- base_list::push_back(a, mem_root)
- bool push_front(T *a)
- T* head()
- Returns base_list::head, casted to type T
- T** head_ref()
- Returns base_list::head_ref, casted to type T
- T* pop()
- Returns base_list::pop, casted to type T
- void concat(List<T> *list)
- void disjoin(List<T> *list)
- void prepand(List<T> *list)
- void delete_elements(void)
- Iterates of all nodes in list, calling delete on the stored data (ie, info pointer)
- List is empty after function returns
- Notes:
- Inherits from base_list
- Very simple wrapper providing type safety
- Only new method is delete_elements
- All other methods just call the ones from base_list, so they have the same quirks (ex, shallow copy)
List_iterator
- Definition:
- Public Data:
- Public Interface:
- List_iterator(List<T> &a)
- List_iterator() base_list_iterator()
- void init(List<T> &a)
- base_list_iterator::init(a)
- T* operator++(int)
- Returns base_list_iterator::next(), casted to T
- T *replace(T *a)
- Returns base_list_iterator::replace(a), casted to T
- T *replace(List<T> &a)
- Returns base_list_iterator::replace(a), casted to T
- void rewind(void)
- base_list_iterator::rewind()
- void remove()
- base_list_iterator::remove()
- void after(T *a)
- base_list_iterator::after(a)
- T** ref()
- Returns base_list_iterator::ref(), casted to T
- Notes:
- Inherits from base_list_iterator
- Simple wrapper providing type safety
List_iterator_fast
- Definition:
- Public Data:
- Public Interface:
- List_iterator_fast(List<T> &a)
- List_iterator_fast()
- void init(List<T> &a)
- base_list_iterator::init(a)
- T* operator++(int)
- Returns base_list_iterator::next_fast(), casted to T
- void rewind()
- base_list_iterator::rewind()
- void sublist(List<T> &list_arg, uint32_t el_arg)
- base_list_iterator::sublist(list_arg, el_arg)
- sublist is protected in base_list_iterator, but it is exposed publicly
- Sets the parameter list, ls, to be start with the iterators node and end with its current last
- Notes:
- Inherits from base_list_iterator
- This only calls the next_fast method, which doesn't update previous pointer
- Puts replace, remove, after, and ref methods into protected since they require the previous pointer
ilink
- Definition:
- Public Data:
- Public Interface:
- ilink()
- Default constructor, sets data to 0
- void unlink()
- Safely sets previous to next and sets data to 0
- ~ilink()
- Notes:
- A simple intrusive list which automatically removes element from list on delete
- For some reason, operators new and delete are overloaded for this struct but only call malloc/free. Left over from previous implementation?
base_ilist
- Definition:
- Public Data:
- Public Interface:
- void empty()
- Empties the list (sets first = last, last.prev = first)
- base_ilist()
- Default constructor, calls empty
- bool is_empty()
- Returns whether or not list is empty (first == &last)
- void append(ilink *a)
- Inserts a at the front of the list
- void push_back(ilink *a)
- Inserts a at the end of the list
- ilink* get()
- Removes first item from the list and returns it
- Returns 0 if list is empty
- ilink* head()
- Returns first item in the list
- Returns 0 if list is empty
- Notes:
- Copy constructor of this class does not create a usable copy, as its members may point at each other.
base_ilist_iterator
- Definition:
- Public Data:
- Public Interface:
- base_ilist_iterator(base_ilist &list_par)
- Initializes internal data
- Not ready for use until next is called
- void *next()
- Updates pointer to next
- If point to end, return 0
- Returns data it is pointing at
- Comment says: This is coded to allow push_back() while iterating
- Notes:
I_List
- Definition:
- Public Data:
- Public Interface:
- I_List()
- void empty()
- bool is_empty()
- Returns base_ilist::is_empty()
- void append(T* a)
- void push_back(T* a)
- T* get()
- Returns base_ilist::get(), casted to T
- T* head()
- Returns base_ilist::head(), casted to T
- Notes:
- Inherits from base_ilist
- No new methods
- Simple wrapper providing type safety
I_List_iterator
- Definition:
- Public Data:
- Public Interface:
- I_List_iterator(I_List<T> &a)
- T* operator++()
- Returns base_ilist_iterator::next(), casted to T
- Notes:
- Inherits from base_ilist_iterator
- No new methods
- Simple wrapper providing type safety
Current Usage
I_List
I_List<NAMED_LIST> key_caches
drizzled/drizzled.cc:364 (definition)
drizzled/set_var.cc:75 (extern)
I_List<thread_info> thread_infos
drizzled/show.cc:997 (local variable)
List
Foreign_key::ref_colums
List<Key_part_spec> ref_columns;
drizzled/foreign_key.h:53
Item_cond::list
List<Item> list;
drizzled/item/cmpfunc.h:1413
Item_equal::fields
List<Item_field> fields;
drizzled/item/cmpfunc.h:1538
COND_EQUAL::current_level
List<Item_equal> current_level;
drizzled/item/cmpfunc.h:1579
Key::columns
List<Key_part_spec> columns
drizzled/key.h:39
multi_update::fields, values
List <Item> *fields, *values
drizzled/multi_update.h:35
multi_update::fields_for_table, values_for_table
List <Item> **fields_for_table, **values_for_table
drizzled/multi_update.h:36
multi_update::unupdated_check_opt_tables
List <Table> unupdated_check_opt_tables;
drizzled/multi_update.h:42
nested_join_st::join_list
List<TableList> join_list;
drizzled/nested_join.h:31
nested_join_st::sj_outer_expr_list
List<Item> sj_outer_expr_list;
drizzled/nested_join.h:62
QUICK_INDEX_MERGE_SELECT::quick_selects
List<QUICK_RANGE_SELECT> quick_selects;
drizzled/opt_range.h:459
QUICK_ROR_INTERSECT_SELECT::quick_selects
List<QUICK_RANGE_SELECT> quick_selects;
drizzled/opt_range.h:518
QUICK_ROR_UNION_SELECT::quick_selects
List<QUICK_SELECT_I> quick_selects;
drizzled/opt_range.h:573
QUICK_GROUP_MIN_MAX_SELECT::min_functions
QUICK_GROUP_MIN_MAX_SELECT::max_functions
List<Item_sum> *min_functions, *max_functions
drizzled/opt_range.h:645
QUICK_SELECT_DESC
List<QUICK_RANGE> rev_ranges;
drizzled/opt_range.h:698
select_insert::fields
List<Item> *fields;
drizzled/select_insert.h:28
st_copy_info::update_fiels, update_values
List<Item> *update_fields, *update_valus
drizzled/session.h:82 ;
Session::warn_list
List <DRIZZLE_ERROR> warn_list;
drizzled/session.h:721
std_add_scheme_table::files
List<LEX_STRING> *files;
drizzled/show.cc:1977
Select_Lex_Unit::item_list
List<Item> item_list;
drizzled/sql_lex.h:323
Select_Lex_Unit::types
List<Item> types
drizzled/sql_lex.h:335
Select_Lex::item_list
drizzled/sql_lex.h:418 List<Item> item_list;
Select_Lex::interval_list
List<String> interval_list;
drizzled/sql_lex.h:419
Select_Lex::top_join_list
List<TableList> top_join_list;
drizzled/sql_lex.h:422
Select_Lex::join_list
List<TableList> *join_list;
drizzled/sql_lex.h:423
Select_Lex::sj_nests
List<TableList> sj_nests;
drizzled/sql_lex.h:425
Select_Lex::inner_refs_list
List<Item_outer_ref> inner_refs_list;
drizzled/sql_lex.h:467
Select_Lex::non_agg_fields
List<Item_field> non_agg_fields;
drizzled/sql_lex.h:485
Select_Lex::prev_join_using
List<String> * prev_join_using;
drizzled/sql_lex.h:502
Select_Lex::index_hints
List<Index_hint> *index_hints;
drizzled/sql_lex.h:631
Alter_Info::drop_list
List<Alter_drop> drop_list;
drizzled/sql_lex.h:671
Alter_Info::alter_list
List<Alter_column> alter_list;
drizzled/sql_lex.h:672
Alter_Info::key_list
List<Key> key_list;
drizzled/sql_lex.h:673
Alter_Info::create_list
List<Create_field> create_list;
drizzled/sql_lex.h:674
LEX::col_list
List<Key_part_spec> col_list;
drizzled/sql_lex.h:864
LEX::ref_;ost
List<Key_part_spec> ref_list;
drizzled/sql_lex.h:865
LEX::interval_list
List<String> interval_list;
drizzled/sql_lex.h:866
LEX::columns
List<Lex_Column> columns;
drizzled/sql_lex.h:867
LEX::insert_list, field_list, value_list, update_list
List<Item> *insert_list,field_list,value_list,update_list;
drizzled/sql_lex.h:868
LEX::many_values
List<List_item> many_values;
drizzled/sql_lex.h:869
LEX::var_list
List<set_var_base> var_list;
drizzled/sql_lex.h:870
LEX::param_list
List<Item_param> param_list;
drizzled/sql_lex.h:871
LEX::context_stack
List<Name_resolution_context> context_stack;
drizzled/sql_lex.h:885
st_rollup
List<Item> *fields;
drizzled/sql_select.h:376
JOIN::fields
List<Item> *fields;
drizzled/sql_select.h:427
JOIN::all_fields
List<Item> all_fields;
drizzled/sql_select.h:486
JOIN::tmp_all_fields1, tmp_all_fields2, tmp_all_fields3
List<Item> tmp_all_fields1, tmp_all_fields2, tmp_all_fields3;
drizzled/sql_select.h:488
JOIN::tmp_fields_list1, tmp_fields_list2, tmp_fields_list3
List<Item> tmp_fields_list1, tmp_fields_list2, tmp_fields_list3;
drizzled/sql_select.h:490
JOIN::fields_list
List<Item> &fields_list;
drizzled/sql_select.h:491
JOIN::join_list
List<TableList> *join_list;
drizzled/sql_select.h:498
YYSTYPE (union)
List<Item> *item_list
List<String> *string_list
drizzled/sql_yacc.h:948
drizzled/sql_yacc.cc:1321
st_foreign_key_info
List<LEX_STRING> foreign_fields, referenced_fields;
drizzled/table.h:402
TableList::join_using_felds
List<String> *join_using_fields;
drizzled/table_list.h:145
TableList::join_columns
List<Natural_join_column> *join_columns
drizzled/table_list.h:150
TableList::index_hints
List<Index_hint> *index_hints
drizzled/table_list.h:161
TableList::join_list
List<TableList> *join_list;
drizzled/table_list.h:197