
Product Architecture

Packaging

GcWord is a collection of cross-platform .NET class libraries written in C# that provides an API for creating DOCX/DOCM MS Word files from scratch. The library can also load, analyze, and modify existing Word documents.

GcWord is compatible with .NET Core 2.x/3.x, .NET Standard 2.x, .NET Framework 4.6.1 or higher, and .NET 6 or higher.

GcWord and supporting packages are available on nuget.org:

| Package | Description |
| --- | --- |
| GrapeCity.Documents.Word | Main package which automatically pulls in the other required infrastructure packages. |
| GrapeCity.Documents.Layout | Enables saving Word documents as PDF. |
| GrapeCity.Documents.Imaging | Provides image handling. |
| GrapeCity.Documents.Common | An infrastructure package used by other packages. |
| GrapeCity.Documents.Common.Windows | Provides support for font linking specified in the Windows registry. On a non-Windows system this library can be referenced, but will do nothing. |
| GrapeCity.Documents.DX.Windows | Provides access to the native graphics APIs when running on a Windows system. |

Document Overview

A Word document in GcWord is represented by an instance of the GrapeCity.Documents.Word.GcWordDocument class.

The object model of the GcWordDocument class corresponds to the structure of a Word document, with the following properties corresponding to major parts of the document:

| Property | Description |
| --- | --- |
| Body | The main document story |
| Styles | A collection of document styles to format document content |
| ListTemplates | A collection of list templates to format list content in the document |
| Settings | Provides options to control view, compatibility and other settings |
| Theme | Provides the different formatting options available to a document through a theme |
| CustomXMLParts | Provides the collection of CustomXMLPart objects |
| GlossaryDocument | Provides the supplementary document storage which stores the content for future insertion |
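
As a minimal sketch of how these parts fit together, the following C# program creates a document, adds one paragraph to the main body, and saves it. It assumes the GrapeCity.Documents.Word package is referenced and that the project targets a C# version with top-level statements (.NET 6 or later). The file name is an arbitrary example.

```csharp
using GrapeCity.Documents.Word;

// Create an empty document. Its Body is the main document story.
var doc = new GcWordDocument();

// Add a paragraph to the first section of the main body.
doc.Body.Sections.First.GetRange().Paragraphs.Add("Hello, World!");

// Save the document as DOCX (the file name is only an example).
doc.Save("HelloWorld.docx");
```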

Body

Body is the place where the content elements (representing the actual content of a document) are stored. GcWordDocument.Body represents the main content of the document, but other parts of the document (such as headers/footers, comments, footnotes/endnotes) also have bodies to store their content. The specific body type is indicated by the GrapeCity.Documents.Word.BodyType enumeration, which has the following members:

| Member | Description |
| --- | --- |
| Main | Body of main document part |
| Header | Body of section header |
| Footer | Body of section footer |
| Comment | Body of comment |
| BuildingBlock | Body of building block |
| Footnote | Body of footnote |
| FootnoteSeparator | Body of footnote separator |
| FootnoteContinuationSeparator | Body of footnote continuation separator |
| FootnoteContinuationNotice | Body of footnote continuation notice |
| Endnote | Body of endnote |
| EndnoteSeparator | Body of endnote separator |
| EndnoteContinuationSeparator | Body of endnote continuation separator |
| EndnoteContinuationNotice | Body of endnote continuation notice |

Unlike other body types, the main body has Sections as its top-level content elements. It also contains collections of comments, footnotes, and endnotes. There are three types of content elements that can be stored in a body:

| Content Element Type | Description | Content Elements |
| --- | --- | --- |
| Block elements | Top-level elements | Sections, tables, paragraphs |
| Inline elements | Elements that must be placed inside other elements | Runs, texts, pictures, simple fields, hyperlinks, footnotes, endnotes |
| Reference elements | Elements that do not have their own content in the body (except for complex fields, see Complex Fields) but are represented by start/end markers | Bookmarks, comments, complex fields |

The following sections explain how to access and work with various content elements of a body.

Range

A range is a sequence of content elements in a body. The body itself is a kind of range that holds all the content elements. In GcWord, the Range class is the main means of accessing the various content elements in a document.

All content elements have a GetRange() method. The Range object it returns has properties that return collections of specific types of objects included in the range, so you can access and modify the elements of each type inside a content element's range. These collections let you add or insert elements using their Add() and Insert() methods.

Note that adding or inserting always occurs at one or both (e.g. when replacing a range) of a range's boundaries. It is not possible to insert something in the middle of a range without first creating a range with a boundary at that position.

A range provides the following two overloads to get new ranges based on it:

| Method | Description |
| --- | --- |
| GetRange(ContentObject first, ContentObject last) | Gets a range that extends from the 'first' content object to the 'last' |
| GetRange(Marker start, Marker end) | Gets a range with fine-grained control over the range's bounds, e.g. GetRange(first.End, last.Start). For more information, see GcWord API Reference. |

To clear all content in a range, use the Range.Clear() method. Because a Range is a collection of ContentObject, you can also enumerate the content elements it includes.
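
The sketch below illustrates the range-based access described above: elements are added through the typed collections of a range, and the body, being a range itself, is enumerated directly. It uses only the parameterless GetRange() and the Paragraphs and Runs collections. The sample texts and variable names are illustrative, and the assumption that Body exposes the same typed collections as any other range follows from the description above.

```csharp
using System;
using GrapeCity.Documents.Word;

var doc = new GcWordDocument();

// The range of the first section exposes a collection of paragraphs.
var pars = doc.Body.Sections.First.GetRange().Paragraphs;
var first = pars.Add("First paragraph.");
var second = pars.Add("Second paragraph.");

// Adding happens at a range boundary: here, at the end of the first paragraph's range.
first.GetRange().Runs.Add(" Appended run.");

// The body is itself a range, so its typed collections can be enumerated.
int count = 0;
foreach (Paragraph p in doc.Body.Paragraphs)
{
    count++;
}
Console.WriteLine($"Paragraphs in body: {count}");
```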

ContentObject

Block and inline elements are derived from the ContentObject class, which provides access to the start and end positions of an element in a document. It also lets you get the parent content element and enumerate the element's children.

In addition, all content objects have Next and Previous properties, which let you enumerate objects of the same content type through the whole body.

The Delete() method of the ContentObject class removes the element itself and all its inner content from the body.

ContentRange

Reference elements (bookmarks, comments, and complex fields) differ slightly from simple ContentObject elements. Elements of this kind do not have a parent content element, since they can start and end anywhere. For example, a reference element can start in one section and end in another. Instead, reference elements provide a pair of ContentObjects named ContentMark that define the start and end of the element. A ContentMark has an Owner property that points to the ContentRange element. Removing a ContentMark from the body also removes its owner element. The Delete() method on a ContentRange usually removes only its ContentMarks. Complex fields are an exception: their actual internal content is deleted as well.

Complex Fields

Although a complex field inherits from ContentRange, it is actually a combination of ContentRange and ContentObject. The bounds of a complex field are defined by special field characters (see the FieldChar class and the associated enum that defines the type of the field character as Begin, Separator or End). A complex field can contain two ranges, a code range and a result range, separated by a Separator field character.

The code range usually contains one or several codes (see the FieldCode class), which in turn contain instructions on how to calculate the field's result. The result range contains the cached result of the instructions. In the current version, GcWord does not yet calculate instructions, so it does not update the result.

As mentioned above, unlike other ContentRange elements, the Delete() method on a complex field removes not only the field characters from the body but the field codes and the result too.

Section

Sections can only be present in the main body, and any document must have at least one section.

Sections let you change page formatting for parts of a document. The PageSetup property and the headers and footers collections of a section provide the means to do that. Each section can have its own headers, footers, and page formatting.

Headers and footers are displayed on each page of the section, and they have their own bodies to store their content. There are several types of headers and footers in a section (see the HeaderFooterType enum), and each header or footer can be linked to the header or footer of the same type in the previous section, so you do not have to create identical headers or footers for each section.
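
As an illustration, the following sketch adds a primary header and footer to the first section. It assumes that a section's Headers and Footers collections are indexed by HeaderFooterType, that HeaderFooterType.Primary identifies the default header or footer, and that each header or footer exposes its content through a Body property. Treat these member names as assumptions to check against the GcWord API Reference. The texts and file name are arbitrary.

```csharp
using GrapeCity.Documents.Word;

var doc = new GcWordDocument();
var section = doc.Body.Sections.First;

// Content of the main body.
section.GetRange().Paragraphs.Add("Body text.");

// Headers and footers keep their content in their own bodies,
// separate from the main body.
section.Headers[HeaderFooterType.Primary].Body.Paragraphs.Add("Sample header");
section.Footers[HeaderFooterType.Primary].Body.Paragraphs.Add("Sample footer");

doc.Save("HeaderFooter.docx");
```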

Run

A run is a contiguous fragment of body content with uniform formatting, which makes it the primary means of changing character formatting. It is also a container for all other inline elements (excluding simple fields and hyperlinks).
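
A short sketch of character formatting through runs: each run added to a paragraph carries its own font settings. It assumes that a run exposes its character formatting through a Font property with Bold and Italic members. The texts and file name are examples only.

```csharp
using GrapeCity.Documents.Word;

var doc = new GcWordDocument();

// Adding a paragraph with text creates its first run.
var p = doc.Body.Sections.First.GetRange().Paragraphs.Add("Plain text, ");

// Each additional run can carry different character formatting.
var bold = p.GetRange().Runs.Add("bold text, ");
bold.Font.Bold = true;

var italic = p.GetRange().Runs.Add("italic text.");
italic.Font.Italic = true;

doc.Save("Runs.docx");
```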

Nesting elements

The top-level elements in the main body are the sections. For other body types, the top-level elements can be paragraphs, tables and content marks (see ContentRange).

Usually, elements of the same type cannot be nested (for example, a Run cannot be nested within another Run). Only SimpleField and Hyperlink can be nested. Also, a cell in a table can contain another table within its own cells.

Styles

Styles are the main means of applying formatting to a document's content. GcWord provides 375 built-in styles. There are different style types (see the StyleType enumeration), and each type of style can be applied only to the corresponding content type. You can get any built-in style using the BuiltInStyleId enumeration.

The StyleCollection class has default styles, which can be fetched or set using its GetDefaultStyle(StyleType) and SetDefaultStyle(StyleType, Style) methods. These styles are applied to content that does not have an explicitly specified style. StyleCollection also provides the DefaultFont and DefaultParagraphFormat properties, which supply the base formatting for the default styles.

Some styles are linked. A linked style is a grouping of a paragraph style and a character style, which a user interface uses to allow the same set of formatting properties to be applied either to a whole paragraph or to a run. For example, if you want to apply the Heading 1 paragraph style to a run, you can apply it using Document.Styles[BuiltInStyleId.Heading1].LinkStyle.
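
The sketch below applies a built-in paragraph style to a paragraph and its linked character style to a run. It assumes that both Paragraph and Run have a settable Style property. LinkStyle is used as named in the text above. The texts and file name are examples only.

```csharp
using GrapeCity.Documents.Word;

var doc = new GcWordDocument();
var pars = doc.Body.Sections.First.GetRange().Paragraphs;

// A paragraph style is applied to a paragraph.
var heading = pars.Add("Chapter 1");
heading.Style = doc.Styles[BuiltInStyleId.Heading1];

// The linked character style applies the same formatting to a single run.
var p = pars.Add("Body text with ");
var run = p.GetRange().Runs.Add("a heading-styled run.");
run.Style = doc.Styles[BuiltInStyleId.Heading1].LinkStyle;

doc.Save("Styles.docx");
```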

Formatting inheritance

GcWord lets you get the actual formatting values of elements. It takes into account formatting inherited from the default document formatting, the base style, the applied style, and the parent content, as well as the direct formatting of the element.

ListTemplates

GcWord provides 21 built-in list templates to create lists in the document. The formatting of these templates is the same as in the Microsoft Word built-in list templates. There is no "list" class in GcWord. To create a list, set the ListFormat.Template and (for multilevel lists) ListFormat.LevelNumber properties on each paragraph that should be in the list.
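
A minimal sketch of building a numbered list this way follows. It assumes that ListTemplates.Add accepts a built-in template id plus a name, and that BuiltInListTemplateId.NumberArabicDot is one of the built-in templates. The template name, item texts, and file name are made up for the example.

```csharp
using GrapeCity.Documents.Word;

var doc = new GcWordDocument();
var pars = doc.Body.Sections.First.GetRange().Paragraphs;

// Create a list template based on a built-in one (the name is arbitrary).
ListTemplate template = doc.ListTemplates.Add(BuiltInListTemplateId.NumberArabicDot, "sampleList");

// There is no list object: each paragraph joins the list through its ListFormat.
foreach (var text in new[] { "First item", "Second item", "Third item" })
{
    var p = pars.Add(text);
    p.ListFormat.Template = template;
    p.ListFormat.LevelNumber = 0; // top level; higher values nest deeper
}

doc.Save("List.docx");
```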

Settings

The Settings class lets you set properties that apply to the whole document, add custom document properties, control document variables, detect and remove document macros, and change view options.