This article describes the implementation of C1DBSpellChecker, a utility that performs spell-checking on OleDb databases including SqlServer and Access.
The user specifies a connection string, then selects a table and one or more string fields on the table to spell-check. Typing errors are displayed in a grid where they can be corrected. When all corrections have been made, the changes are saved back to the database.
Spell-checking is an important tool used by most people when creating documents of all kinds, from formal reports to simple e-mail messages. Many popular applications provide built-in spell checking that makes this task easy and almost automatic.
Unfortunately, not all applications have built-in spell-checking. Visual Studio is a good example. Developers use Visual Studio to create applications, web pages, and documentation. Unless they have a package such as ComponentOne IntelliSpell, it is likely that spelling mistakes will creep into their work and eventually surface on a web site or commercial application.
Even developers that use ComponentOne IntelliSpell often rely on content that is stored in databases and hasn't been properly spell-checked. For example, the popular AdventureWorks database contains typos such as:
Not very professional on a catalog or web site... Surprisingly, there are few or no tools for spell-checking databases. The C1DBSpellChecker application was designed to fill this need.
The C1DBSpellChecker application performs the following tasks:
These tasks are described in the following sections.
The first step required to spell-check a database is selecting the database to use. This is done by specifying a connection string. We use the ADODB and MSDASC libraries to accomplish this. These libraries are provided by Microsoft and can be freely distributed. They provide a user interface for creating and editing OleDb connection strings.
We decided to use the OleDb data provider because is provides great flexibility, allowing connections to Sql Server, Access, and many others.
The code used to get and edit the connection strings is simple (the version below omits error checking code for clarity, please refer to the source for a more complete version):
// prompt user for a connection string string PromptConnectionString(string connString) { // create objects we'll need var dlinks = new MSDASC.DataLinksClass(); var conn = new ADODB.ConnectionClass(); // show connection picker dialog object obj = conn; dlinks.hWnd = (int)Handle; if (dlinks.PromptEdit(ref obj)) { connString = conn.ConnectionString; } // done return connString; }
Connection strings created by the user are added to a ComboBox
and saved as
part of the application settings, so they can be reused across sessions. The code that
performs this task is listed below:
// form loaded: load recently used connection strings protected override void OnLoad(EventArgs e) { var mru = Properties.Settings.Default.RecentConnections; if (mru != null) { foreach (string connString in mru) { _cmbConnString.Items.Add(connString); } } base.OnLoad(e); } // form closing: save recently used connection strings protected override void OnFormClosing(FormClosingEventArgs e) { var mru = new System.Collections.Specialized.StringCollection(); foreach (string item in _cmbConnString.Items) { mru.Add(item); } Properties.Settings.Default.RecentConnections = mru; Properties.Settings.Default.Save(); base.OnFormClosing(e); }
Because the connection strings are fairly long, the application uses the owner-draw feature
of the ComboBox
control to trim the connection strings when they are displayed
in the drop down. This makes it a lot easier for users to find the connections they are
looking for.
The owner-draw code for the ComboBox
is as follows:
public Form1()) { InitializeComponent(); // make combo owner-drawn var cmb = _cmbConnString.ComboBox; cmb.DrawMode = DrawMode.OwnerDrawFixed; cmb.DrawItem += cmb_DrawItem; } // trim items in combo using ellipsis (they're very long) void cmb_DrawItem(object sender, DrawItemEventArgs e) { var fmt = new StringFormat(); fmt.LineAlignment = StringAlignment.Center; fmt.Trimming = StringTrimming.EllipsisPath; var text = (string)_cmbConnString.Items[e.Index]; text = TrimConnectionString(text); var brush = (e.State & DrawItemState.Selected) != 0 ? SystemBrushes.HighlightText : SystemBrushes.WindowText; e.DrawBackground(); e.Graphics.DrawString(text, _cmbConnString.Font, brush, e.Bounds, fmt); e.DrawFocusRectangle(); } // trim connection string for display string[] _keys = new string[] { "Provider", "Initial Catalog", "Data Source" }; string TrimConnectionString(string text) { var sb = new StringBuilder(); foreach (var item in text.Split(';')) { foreach (var key in _keys) { if (item.IndexOf(key, StringComparison.InvariantCultureIgnoreCase) > -1) { if (sb.Length > 0) { sb.Append("..."); } sb.Append(item.Split('=')[1].Trim()); } } } return sb.ToString(); }
The code works by splitting the connection string into key/value pairs and then keeping only the parts that help identify the connection. Configuration options are removed for clarity. For example, here is a typical OleDb connection string in all its glory:
Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=AdventureWorks;Data Source=spock\sqlexpress
And here is the trimmed version:
SQLOLEDB.1...AdventureWorks...spock\sqlexpress
The trimmed version shows only the information that is relevant for selecting an entry from a long list.
Once the user picks a connection string, the next step is to use the connection to obtain a database schema. The schema is a list of the tables, fields, and relations in the database. It describes the structure of the database.
We retrieve the database schema using an auxiliary class called OleSDbSchema
.
This class extends the system DataSet
class with a ConnectionString
property. Setting this property populates the OleSDbSchema
with tables that
have the same structure as the tables in the database (but no data).
The code below shows how the OleSDbSchema
class obtains the database
schema (this version is simplified for clarity; please refer to the source code for
a complete version, including code that retrieves constraints, relations, and
stored procedures):
// Gets or sets the connection string used to fills the schema. public string ConnectionString { get { return _connString; } set { if (value != _connString) { _connString = value; GetSchema(); } } } void GetSchema() { // initialize this DataSet this.Reset(); // go get the schema EnforceConstraints = false; using (var conn = new OleDbConnection(connString)) { conn.Open(); GetTables(conn); conn.Close(); } } void GetTables(OleDbConnection conn) { // add tables var dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null); foreach (DataRow dr in dt.Rows) { // get type (table/view) var type = (string)dr[TABLE_TYPE]; if (type != TABLE && type != VIEW && type != LINK) { continue; } // create table var name = (string)dr[TABLE_NAME]; var table = new DataTable(name); table.ExtendedProperties[TABLE_TYPE] = type; // save definition in extended properties foreach (DataColumn col in dt.Columns) { table.ExtendedProperties[col.ColumnName] = dr[col]; } // get table schema and add to collection try { var select = GetSelectStatement(table); var da = new OleDbDataAdapter(select, conn); da.FillSchema(table, SchemaType.Mapped); Tables.Add(table); } catch { } } }
The code shows how tables and columns are created. Notice that the
ExtendedProperties property is used to store additional
information about each element. This allows us for example to distinguish
between regular tables, views, and stored procedures, since all these
elements are represented by DataTable
objects within the
OleDbSchema
.
The
OleDbSchema
class is described in detail in a separate article.
We only use its basic features in this project.
Once the schema has been obtained, it is used to populate a
TreeView
control. The TreeView
has nodes that represent
the tables in the database, and each node has child nodes that represent the fields.
Only string fields are included, since they are the only ones eligible for spell checking.
This is the code that populates the TreeView
control:
// update table tree to reflect new connection string void UpdateTableTree() { // initialize table tree TreeNodeCollection nodes = _treeTables.Nodes; nodes.Clear(); // populate using current schema _treeTables.BeginUpdate(); foreach (DataTable dt in _schema.Tables) { if (_schema.GetTableType(dt) == TableType.Table) { // create new node, save table in tag property var node = new TreeNode(dt.TableName); node.Tag = dt; // add string fields to node foreach (DataColumn col in dt.Columns) { if (col.DataType == typeof(string)) { var ndCol = node.Nodes.Add(col.ColumnName); ndCol.Tag = col; } } // add new node to the tree if (node.Nodes.Count > 0) { nodes.Add(node); } } } // done _treeTables.Sort(); _treeTables.EndUpdate(); }
The TreeView
has check boxes next to each table and field. The check boxes require
a fair amount of code to work in an intuitive, automatic manner. The code must ensure that only
one table is selected, and is must handle the check boxes next to tables and fields. Specifically,
checking a table automatically checks all its fields and un-checks all other tables. Un-checking
a table un-checks all its fields. And checking a field automatically checks the parent table and
un-checks all other tables.
Here is the code that manages the check boxes:
// handle check boxes void _treeTables_AfterCheck(object sender, TreeViewEventArgs e) { if (_updatingTree) { return; } // start updating... _updatingTree = true; // get node that was clicked var n = e.Node; // clicked on table node if (n.Tag is DataTable) { // apply check state to all child nodes (fields) SetCheck(n, n.Checked); // if this table is checked, uncheck all other tables if (n.Checked) { foreach (TreeNode c in n.TreeView.Nodes) { if (c != n) { SetCheck(c, false); } } } } // clicked on column node if (n.Tag is DataColumn) { // update parent node state bool check = false; foreach (TreeNode c in n.Parent.Nodes) { check |= c.Checked; } n.Parent.Checked = check; // if this node is checked, uncheck all other tables if (n.Checked) { foreach (TreeNode c in n.TreeView.Nodes) { if (c != n.Parent) { SetCheck(c, false); } } } } // done updating... _updatingTree = false; } void SetCheck(TreeNode n, bool check) { n.Checked = check; foreach (TreeNode c in n.Nodes) { c.Checked = check; } }
Once the user has selected one or more fields to spell-check, we need to read the
data, spell-check each field, and keep only the records that have spelling
errors in them. we use an OleDbReader
to read the records and
a C1SpellChecker
to spell-check each one. Records that contain
spelling errors are stored in a DataTable
to be edited by the user.
Here is the code that reads the data and performs the spell-checking:
// get data table with the rows that contain spelling errors DataTable GetSpellingErrors(OleDbDataAdapter da, List<string> columns) { // create table and adapter var dt = new DataTable(); // get table schema da.FillSchema(dt, SchemaType.Mapped); // make columns not being spell-checked read-only foreach (DataColumn col in dt.Columns) { if (!columns.Contains(col.ColumnName)) { col.ReadOnly = true; } } // read rows with DataReader var cmd = da.SelectCommand; cmd.Connection.Open(); using (var reader = cmd.ExecuteReader()) { while (reader.Read()) { // read a row var dr = dt.NewRow(); foreach (DataColumn col in dt.Columns) { var index = col.Ordinal; dr[index] = reader.GetValue(index); } // check for errors bool hasErrors = false; foreach (string col in columns) { var text = dr[col] as string; if (!string.IsNullOrEmpty(text)) { var errors = _spell.CheckText(text); if (errors.Count > 0) { hasErrors = true; } } } // keep rows that have errors if (hasErrors) { dt.Rows.Add(dr); } } } // table has no changes dt.AcceptChanges(); // done return dt; }
A simpler but less efficient alternative would be to read all the data into the
DataTable
first, then spell check it and remove the rows that
don't contain any spelling mistakes. The version listed above is better because
rows without errors are discarded immediately and never added to the
DataTable
.
Now that we have a DataTable
with all the errors, the next step is
to show the table to the user so he can review and fix them. We use a
C1FlexGrid
control to do that. We use the grid's owner-draw
feature to mark the typos with the familiar red wavy underlines, and attach a
C1SpellChecker
to the grid editor so the user gets a nice context
menu with spelling suggestions while editing the cells.
Here is the code that binds the grid to the table containing the errors and implements the owner-draw logic:
// bind the grid and enable owner-draw to show the errors _flex.DataSource = _dtErrors; _flex.DrawMode = C1.Win.C1FlexGrid.DrawModeEnum.OwnerDraw; _flex.OwnerDrawCell += _flex_OwnerDrawCell; // draw wavy read lines under spelling errors void _flex_OwnerDrawCell(object sender, C1.Win.C1FlexGrid.OwnerDrawCellEventArgs e) { // spell check editable cells (unless we're just measuring) if (!e.Measuring && _flex.Cols[e.Col].AllowEditing && e.Row >= _flex.Rows.Fixed) { var text = _flex.GetDataDisplay(e.Row, e.Col); var errors = _spell.CheckText(text); if (errors.Count > 0) { // draw cell as usual e.Style = _errorStyle; e.DrawCell(); // build list with error ranges var ranges = new CharacterRange[errors.Count]; for (int i = 0; i < errors.Count; i++) { ranges[i] = new CharacterRange( errors[i].Start, errors[i].Length); } // create StringFormat to locate the error ranges var sf = new StringFormat(e.Style.StringFormat); try { sf.SetMeasurableCharacterRanges(ranges); } catch { } // locate the error ranges var rc = e.Style.GetTextRectangle(e.Bounds, null); var rgns = e.Graphics.MeasureCharacterRanges( text, e.Style.Font, rc, sf); // draw wavy red underline for each range foreach (var rgn in rgns) { rc = Rectangle.Truncate(rgn.GetBounds(e.Graphics)); for (Point pt = new Point(rc.X, rc.Bottom); pt.X + 2 < rc.Right; pt.X += 4) { e.Graphics.DrawLines(Pens.Red, new Point[] { new Point(pt.X, pt.Y), new Point(pt.X + 2, pt.Y - 2), new Point(pt.X + 4, pt.Y) }); } } } } }
And here is the code that connects a C1SpellChecker
with the grid
editor so the user can see the underlines while editing the cell and get a nice
context menu with spelling suggestions and commands:
// connect event handler to customize the grid editor _flex.SetupEditor += _flex_SetupEditor; // enable spell checker in cell editor void _flex_SetupEditor(object sender, RowColEventArgs e) { var tb = _flex.Editor as TextBox; if (tb != null) { _spell.SetActiveSpellChecking(tb, true); } }
The form also contains a button that performs a modal (dialog-based) spell check over the entire grid. That can be more convenient and comfortable than searching for the errors by inspecting each grid cell visually.
Implementing the modal check option is easy:
// perform modal check on the grid private void _btnSpell_Click(object sender, EventArgs e) { var cols = new List<string>(); foreach (Column col in _flex.Cols) { if (col.AllowEditing) { cols.Add(col.Name); } } var speller = new FlexGridSpellChecker(_flex, cols.ToArray()); _spell.CheckControl(speller); }
The code starts by building a list with the columns that are to be spell-checked.
Then is uses the list to build a FlexGridSpellChecker
object that
implements the spell-checking interface required by the C1SpellChecker
control on behalf of the grid. The FlexGridSpellChecker
implementation
is not especially interesting, so we won't reproduce it here. Please refer to the
source code if you are interested in the details.
Once the user has reviewed and corrected the errors, the only task remaining is
saving the changes back into the database. To do this, we need an
OleDbDataAdapter
object with select and update commands. The
select command is used to retrieve the data, and the update command is
used to write changes back into the database.
Here is the code that creates the OleDbDataAdapter
and writes
the changes back into the database:
// save changes in the data table void SaveChanges() { try { // build DataAdapter with select and update commands var sql = _schema.GetSelectStatement(table); var da = new OleDbDataAdapter(sql, ConnectionString); var cmdBuilder = new OleDbCommandBuilder(da); da.UpdateCommand = cmdBuilder.GetUpdateCommand(); // save the changes da.Update(_dtErrors); // mark table as clean _dtErrors.AcceptChanges(); // disable save changes button until there are more changes _btnSaveChanges.Enabled = false; } catch (Exception x) { // something went wrong? inform the user var dr = MessageBox.Show( this, string.Format(Properties.Resources.FailedToSave, x.Message), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Error); } }
This concludes the last part of the application. We walked over the steps required to select the database, spell-check its content, display the errors, allow users to fix them, and save the corrections back to the database. It was a long and fun journey!
We developed the C1DBSpellChecker application with two objectives in mind:
We hope both objectives have been achieved. If you have requests or suggestions for improving this document or the application, please post on our site. Thanks in advance.