Fossil

Check-in [7f2689b1]

Overview
Comment: Use tables throughout globs.md
SHA3-256: 7f2689b1b5411c59a3cb1b52c0175262582962962e84c0ac9cf2ea1230e70d79
User & Date: andygoth 2017-09-24 04:42:57
Context
2017-09-24
09:39 Remove unused local variables that caused breakages with -Werror. To do: This branch still gives a very different answer than trunk for /annotate?limit=-1&checkin=8e27a5a0&filename=src/login.c&log=1 check-in: 120ff0b8 user: drh tags: annotation-enhancements
04:42 Use tables throughout globs.md check-in: 7f2689b1 user: andygoth tags: annotation-enhancements
04:41 Correct name of extras command check-in: 4546d93d user: andygoth tags: annotation-enhancements
Changes

Changes to www/globs.md.

Ordinary characters consume a single character of the target and must
match it exactly.

Special characters (and special character sequences) consume zero or
more characters from the target and describe what matches. The special
characters (and sequences) are:



 *  `*` Matches any sequence of zero or more characters;
 *  `?` Matches exactly one character;
 *  `[...]` Matches one character from the enclosed list of characters; and
 *  `[^...]` Matches one character not in the enclosed list.

Special character sequences have some additional features:

 *  A range of characters may be specified with `-`, so `[a-d]` matches
    exactly the same characters as `[abcd]`. Ranges reflect Unicode
    code points without any locale-specific collation sequence.
 *  Include `-` in a list by placing it last, just before the `]`.
................................................................................
 *  Note that unlike typical Unix shell globs, wildcards (`*`, `?`,
    and character lists) are allowed to match `/` directory
    separators as well as the initial `.` in the name of a hidden
    file or directory.

Some examples of character lists:



 *  `[a-d]` Matches any one of `a`, `b`, `c`, or `d` but not `ä`;
 *  `[^a-d]` Matches exactly one character other than `a`, `b`, `c`,
    or `d`;
 *  `[0-9a-fA-F]` Matches exactly one hexadecimal digit;
 *  `[a-]` Matches either `a` or `-`;
 *  `[][]` Matches either `]` or `[`;
 *  `[^]]` Matches exactly one character other than `]`;
 *  `[]^]` Matches either `]` or `^`; and
 *  `[^-]` Matches exactly one character other than `-`.

White space means the specific ASCII characters TAB, LF, VT, FF, CR,
and SPACE.  Note that this does not include any of the many additional
spacing characters available in Unicode, and specifically does not
include U+00A0 NO-BREAK SPACE.

Because both LF and CR are white space and leading and trailing spaces
................................................................................
not be a surprise on Unix where all file names are also case
sensitive. However, most Windows file systems are case preserving and
case insensitive. That is, on Windows, the names `ReadMe` and `README`
are names of the same file; on Unix they are different files.

Some example cases:

 *  The glob `README` matches only a file named `README` in the root of
    the tree. It does not match a file named `src/README` because it
    does not include any characters that consume (and match) the
    `src/` part.
 *  The glob `*/README` does match `src/README`. Unlike Unix file
    globs, it also matches `src/library/README`. However it does not
    match the file `README` in the root of the tree.
 *  The glob `*README` does match `src/README` as well as the file
    `README` in the root of the tree as well as `foo/bar/README` or
    any other file named `README` in the tree. However, it also
    matches `A-DIFFERENT-README` and `src/DO-NOT-README`, or any other
    file whose name ends with `README`.
 *  The glob `src/README` does match the file named `src\README` on
    Windows because all directory separators are rewritten as `/` in
    the canonical name before the glob is matched. This makes it much
    easier to write globs that work on both Unix and Windows.
 *  The glob `*.[ch]` matches every C source or header file in the
    tree at the root or at any depth. Again, this is (deliberately)
    different from Unix file globs and Windows wild cards.


## Where Globs are Used

### Settings that are Globs

These settings are all lists of glob patterns:

 *  `binary-glob`
 *  `clean-glob`
 *  `crlf-glob`
 *  `crnl-glob`
 *  `encoding-glob`
 *  `ignore-glob`
 *  `keep-glob`



All may be [versioned, local, or global](settings.wiki). Use `fossil
settings` to manage local and global settings. For a versioned
setting, use a file named after the setting in the repository's
`.fossil-settings/` folder at the root of the tree.

Using versioned settings for these not only has the advantage that
................................................................................
usually named to correspond to the setting they override, such as
`--ignore` to override the `ignore-glob` setting. These commands are:

 *  [`add`][]
 *  [`addremove`][]
 *  [`changes`][]
 *  [`clean`][]

 *  [`extras`][]
 *  [`merge`][]
 *  [`settings`][]
 *  [`status`][]
 *  [`unset`][]

The commands [`tarball`][] and [`zip`][] produce compressed archives of a
................................................................................
files to serve with static content where a list of glob patterns
specifies what content may be served.

[`add`]: /help?cmd=add
[`addremove`]: /help?cmd=addremove
[`changes`]: /help?cmd=changes
[`clean`]: /help?cmd=clean

[`extras`]: /help?cmd=extras
[`merge`]: /help?cmd=merge
[`settings`]: /help?cmd=settings
[`status`]: /help?cmd=status
[`unset`]: /help?cmd=unset

[`tarball`]: /help?cmd=tarball
................................................................................
a glob pattern. Find commands and pages in the fossil sources by
looking for comments like `COMMAND: add` or `WEBPAGE: timeline` in
front of the function that implements the command or page in files
`src/*.c`. (Fossil's build system creates the tables used to dispatch
commands at build time by searching the sources for those comments.) A
few starting points:

 *  [`src/glob.c`][glob.c] implements glob pattern list loading,
    parsing, and matching.
 *  [`src/file.c`][file.c] implements various kinds of canonical
    names of a file.


[glob.c]: https://www.fossil-scm.org/index.html/file/src/glob.c
[file.c]: https://www.fossil-scm.org/index.html/file/src/file.c

The actual pattern matching is implemented in SQL, so the
documentation for `GLOB` and the other string matching operators in
[SQLite] (https://sqlite.org/lang_expr.html#like) is useful. Of
course, the SQLite source code and test harnesses also make
entertaining reading:

 *  `src/func.c` [lines 570-768]
    (https://www.sqlite.org/src/artifact?name=9d52522cc8ae7f5c&ln=570-768)
 *  `test/expr.test` [lines 586-673]
    (https://www.sqlite.org/src/artifact?name=66a2c9ac34f74f03&ln=586-673)








Ordinary characters consume a single character of the target and must
match it exactly.

Special characters (and special character sequences) consume zero or
more characters from the target and describe what matches. The special
characters (and sequences) are:

:Pattern |:Effect
---------------------------------------------------------------------
`*`      | Matches any sequence of zero or more characters
`?`      | Matches exactly one character
`[...]`  | Matches one character from the enclosed list of characters
`[^...]` | Matches one character not in the enclosed list

Special character sequences have some additional features:

 *  A range of characters may be specified with `-`, so `[a-d]` matches
    exactly the same characters as `[abcd]`. Ranges reflect Unicode
    code points without any locale-specific collation sequence.
 *  Include `-` in a list by placing it last, just before the `]`.
................................................................................
 *  Note that unlike typical Unix shell globs, wildcards (`*`, `?`,
    and character lists) are allowed to match `/` directory
    separators as well as the initial `.` in the name of a hidden
    file or directory.

Some examples of character lists:

:Pattern |:Effect
---------------------------------------------------------------------
`[a-d]`  | Matches any one of `a`, `b`, `c`, or `d` but not `ä`
`[^a-d]` | Matches exactly one character other than `a`, `b`, `c`, or `d`
`[0-9a-fA-F]` | Matches exactly one hexadecimal digit
`[a-]`   | Matches either `a` or `-`
`[][]`   | Matches either `]` or `[`
`[^]]`   | Matches exactly one character other than `]`
`[]^]`   | Matches either `]` or `^`
`[^-]`   | Matches exactly one character other than `-`
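
The character-list rules in this table can be tried out against SQLite's `GLOB` operator, which this document names further down as the reference for the matching behavior. The following is a minimal sketch, assuming Python's standard `sqlite3` module. The helper name `glob_match` is hypothetical and not part of Fossil, and Fossil's own matcher may differ in detail.

```python
import sqlite3


def glob_match(pattern: str, target: str) -> bool:
    """Return True if SQLite's GLOB operator says target matches pattern.

    Hypothetical helper for illustration; it is not a Fossil API.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # In "X GLOB Y", X is the target string and Y is the pattern.
        row = conn.execute("SELECT ? GLOB ?", (target, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()


# Each entry: (pattern, target, result expected from the table above).
cases = [
    ("[a-d]", "c", True),
    ("[a-d]", "ä", False),        # ranges follow code points, not collation
    ("[^a-d]", "e", True),
    ("[0-9a-fA-F]", "F", True),
    ("[0-9a-fA-F]", "g", False),
    ("[a-]", "-", True),          # '-' placed last is a literal
    ("[][]", "[", True),          # ']' placed first is a literal
    ("[][]", "]", True),
    ("[^]]", "]", False),
    ("[^]]", "x", True),
    ("[]^]", "^", True),
    ("[^-]", "-", False),
]

for pattern, target, expected in cases:
    print(f"{pattern!r} vs {target!r}: got {glob_match(pattern, target)}, "
          f"table says {expected}")
```

Opening a fresh in-memory database per call keeps the sketch self-contained. Real code would reuse one connection.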

White space means the specific ASCII characters TAB, LF, VT, FF, CR,
and SPACE.  Note that this does not include any of the many additional
spacing characters available in Unicode, and specifically does not
include U+00A0 NO-BREAK SPACE.

Because both LF and CR are white space and leading and trailing spaces
................................................................................
not be a surprise on Unix where all file names are also case
sensitive. However, most Windows file systems are case preserving and
case insensitive. That is, on Windows, the names `ReadMe` and `README`
are names of the same file; on Unix they are different files.

Some example cases:

:Pattern     |:Effect
--------------------------------------------------------------------------------
`README`     | Matches only a file named `README` in the root of the tree. It does not match a file named `src/README` because it does not include any characters that consume (and match) the `src/` part.
`*/README`   | Matches `src/README`. Unlike Unix file globs, it also matches `src/library/README`. However it does not match the file `README` in the root of the tree.
`*README`    | Matches `src/README` as well as the file `README` in the root of the tree as well as `foo/bar/README` or any other file named `README` in the tree. However, it also matches `A-DIFFERENT-README` and `src/DO-NOT-README`, or any other file whose name ends with `README`.
`src/README` | Matches `src\README` on Windows because all directory separators are rewritten as `/` in the canonical name before the glob is matched. This makes it much easier to write globs that work on both Unix and Windows.
`*.[ch]`     | Matches every C source or header file in the tree at the root or at any depth. Again, this is (deliberately) different from Unix file globs and Windows wild cards.
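
The path examples in this table can be reproduced with SQLite's `GLOB` operator, which the document cites as the reference for matching. This is a sketch under stated assumptions: it uses Python's standard `sqlite3` module, the helper `glob_match` is a hypothetical name, and the targets are assumed to be already in canonical form with `/` separators. The rewriting of `\` to `/` that Fossil performs on Windows is not modeled here.

```python
import sqlite3


def glob_match(pattern: str, target: str) -> bool:
    """Return True if SQLite's GLOB operator says target matches pattern.

    Hypothetical helper for illustration; it is not a Fossil API.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # In "X GLOB Y", X is the target string and Y is the pattern.
        row = conn.execute("SELECT ? GLOB ?", (target, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()


# Each entry: (pattern, canonical file name, result expected from the table).
cases = [
    ("README", "README", True),
    ("README", "src/README", False),          # nothing consumes "src/"
    ("README", "ReadMe", False),              # matching is case sensitive
    ("*/README", "src/README", True),
    ("*/README", "src/library/README", True),  # '*' may cross '/'
    ("*/README", "README", False),            # the '/' must be present
    ("*README", "README", True),
    ("*README", "src/DO-NOT-README", True),
    ("*.[ch]", "src/main.c", True),
    ("*.[ch]", "glob.h", True),
    ("*.[ch]", "README", False),
]

for pattern, name, expected in cases:
    print(f"{pattern!r} vs {name!r}: got {glob_match(pattern, name)}, "
          f"table says {expected}")
```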



## Where Globs are Used

### Settings that are Globs

These settings are all lists of glob patterns:

:Setting        |:Description
--------------------------------------------------------------------------------
`binary-glob`   | Files that should be treated as binary files for committing and merging purposes
`clean-glob`    | Files that the [`clean`][] command will delete without prompting or allowing undo
`crlf-glob`     | Files in which it is okay to have `CR`, `CR`+`LF` or mixed line endings.  Set to "`*`" to disable CR+LF checking
`crnl-glob`     | Alias for the `crlf-glob` setting
`encoding-glob` | Files that the [`commit`][] command will ignore when issuing warnings about text files that may use another encoding than ASCII or UTF-8.  Set to "`*`" to disable encoding checking
`ignore-glob`   | Files that the [`add`][], [`addremove`][], [`clean`][], and [`extras`][] commands will ignore
`keep-glob`     | Files that the [`clean`][] command will keep

All may be [versioned, local, or global](settings.wiki). Use `fossil
settings` to manage local and global settings. For a versioned
setting, use a file named after the setting in the repository's
`.fossil-settings/` folder at the root of the tree.

Using versioned settings for these not only has the advantage that
................................................................................
usually named to correspond to the setting they override, such as
`--ignore` to override the `ignore-glob` setting. These commands are:

 *  [`add`][]
 *  [`addremove`][]
 *  [`changes`][]
 *  [`clean`][]
 *  [`commit`][]
 *  [`extras`][]
 *  [`merge`][]
 *  [`settings`][]
 *  [`status`][]
 *  [`unset`][]

The commands [`tarball`][] and [`zip`][] produce compressed archives of a
................................................................................
files to serve with static content where a list of glob patterns
specifies what content may be served.

[`add`]: /help?cmd=add
[`addremove`]: /help?cmd=addremove
[`changes`]: /help?cmd=changes
[`clean`]: /help?cmd=clean
[`commit`]: /help?cmd=commit
[`extras`]: /help?cmd=extras
[`merge`]: /help?cmd=merge
[`settings`]: /help?cmd=settings
[`status`]: /help?cmd=status
[`unset`]: /help?cmd=unset

[`tarball`]: /help?cmd=tarball
................................................................................
a glob pattern. Find commands and pages in the fossil sources by
looking for comments like `COMMAND: add` or `WEBPAGE: timeline` in
front of the function that implements the command or page in files
`src/*.c`. (Fossil's build system creates the tables used to dispatch
commands at build time by searching the sources for those comments.) A
few starting points:

:File            |:Description
--------------------------------------------------------------------------------
[`src/glob.c`][] | Implementation of glob pattern list loading, parsing, and matching.
[`src/file.c`][] | Implementation of various kinds of canonical names of a file.



[`src/glob.c`]: https://www.fossil-scm.org/index.html/file/src/glob.c
[`src/file.c`]: https://www.fossil-scm.org/index.html/file/src/file.c

The actual pattern matching is implemented in SQL, so the
documentation for `GLOB` and the other string matching operators in
[SQLite] (https://sqlite.org/lang_expr.html#like) is useful. Of
course, the SQLite [source code]
(https://www.sqlite.org/src/artifact?name=9d52522cc8ae7f5c&ln=570-768)
and [test harnesses]
(https://www.sqlite.org/src/artifact?name=66a2c9ac34f74f03&ln=586-673)
also make entertaining reading.