[README.txt] Regex Crossword Solver version 1.1.0 Contents -------- 01. What is it? 02. Where to get it? 03. Requirements 04. How to build it? 04.a. How to build it with gcc? (Linux, Cygwin) 04.b. How to build it with clang? (Linux, Cygwin) 04.c. How to build it with Visual Studio? (Windows) 04.d. Difference of speed between Release and Debug builds 05. How to install it? 06. How to use it? 07. How to follow the program's progress? 08. How does it work? 09. Known bugs 10. Limitations 10.a. Supported crossword geometries 10.b. Supported regular expressions 11. License 12. Author 13. Contacts 01. What is it? --------------- Regex Crossword Solver is a program which can solve regular expression crosswords, such as the MIT puzzle http://www.mit.edu/~puzzle/2013/coinheist.com/rubik/a_regular_crossword/grid.pdf and some of the ones available at https://regexcrossword.com. 02. Where to get it? -------------------- Regex Crossword Solver is available at: * http://solving-regular-expression-crosswords.blogspot.com (blog) * https://github.com/antoine-trux/regex-crossword-solver (code) 03. Requirements ---------------- Regex Crossword Solver is available only as source code, no binaries are provided, so you need a compiler to build it. Regex Crossword Solver can be built with these compilers: * gcc - in Linux and Cygwin * clang - in Linux and Cygwin * Visual Studio - in Windows I have built and tested Regex Crossword Solver with these combinations: * Linux (64-bit Kubuntu): gcc 4.7.4, gcc 5.2.1, clang 3.6.2 * Cygwin (32-bit): gcc 4.9.3, clang 3.5.2 * Windows 7 (64-bit): Visual Studio Community 2015 (Update 2) Clang has some limitations compared to gcc: * One unit test had to be disabled - search contents of directory 'source/unit_tests' for "__clang__" for details. * I was unable to get Link-Time Optimization to work with clang. As a result, the program is about 7.3% slower than when built with gcc. * Coverage reports are not available. Cygwin and Visual Studio also have limitations: * Valgrind is not available for Cygwin and Windows, so the Valgrind checks do not apply in those environments. * Coverage reports are not available for Visual Studio. That said, the program built with clang or Visual Studio can solve the same crosswords as when built with gcc. The program is written in standard C++11, so in principle any compiler which complies with that standard should work. Note that I was unable to build the program in Visual Studio 2013. For example, that version does not support the C++11 construct 'constexpr' (used in a few places). 04. How to build it? -------------------- Unzip the downloaded file. The instructions then depend on your compiler: 04.a. How to build it with gcc? (Linux, Cygwin) ----------------------------------------------- $ cd /path_to_unzipped_package/source $ make The program should now be: /path_to_unzipped_package/build.g++.release/regex_crossword_solver Then, to check the build: $ make check_without_valgrind If you are patient, you can also check the build with Valgrind (Linux only): $ make check_with_valgrind For more options: $ make help 04.b. How to build it with clang? (Linux, Cygwin) ------------------------------------------------- $ cd /path_to_unzipped_package/source $ make CXX=clang++ The program should now be: /path_to_unzipped_package/build.clang++.release/regex_crossword_solver Then, to check the build: $ make CXX=clang++ check_without_valgrind If you are patient, you can also check the build with Valgrind (Linux only): $ make CXX=clang++ check_with_valgrind For more options: $ make help 04.c. How to build it with Visual Studio? (Windows) --------------------------------------------------- Open Visual Studio solution path_to_unzipped_packageVisualStudio2015regex_crossword_solver.sln In Visual Studio: * change the configuration from "Debug" to "Release", * in menu "Build", select "Build Solution". The program should now be: path_to_unzipped_packageVisualStudio2015regex_crossword_solverreleaseregex_crossword_solver.exe Then, to check the build: * open a Command Prompt * in the Command Prompt: cd path_to_unzipped_packageVisualStudio2015 move run_grid_tests.rename_extension_to_cmd run_grid_tests.cmd run_grid_tests.cmd Note that the unit tests are automatically run when project 'regex_crossword_solver_unit_tests' is built in Visual Studio. 04.d. Difference of speed between Release and Debug builds ---------------------------------------------------------- The Release build is much faster than the Debug build. For example, when built with gcc in Linux, the program solves the MIT puzzle 21 times faster (as measured by Callgrind) when built in Release mode than in Debug mode. Such a large difference can be explained by the following factors: * the Release build uses full optimizations, the Debug build uses none; * the program makes liberal use of assertions - assertions are enabled in the Debug build, disabled in the Release build; * the program's functions are small on average - the Release build inlines small functions, the Debug build does not. 05. How to install it? ---------------------- No installation is needed. The program can be used as such, for example in the directory in which it was built. 06. How to use it? ------------------ Regex Crossword Solver has no GUI. Only a command line interface is provided. Basic usage: [Linux or Cygwin] $ cd /path_to_unzipped_package/build.<compiler>.release $ ./regex_crossword_solver <input file> [Visual Studio] * open a Command Prompt * in the Command Prompt: cd path_to_unzipped_packageVisualStudio2015regex_crossword_solverrelease regex_crossword_solver.exe <input file> For input file examples, see directory /path_to_unzipped_package/grid_tests, which contains many examples. For more advanced usage, start the program with option '--help'. 07. How to follow the program's progress? ----------------------------------------- In order to see how the program works its way to a solution, you can use option '--log'. In order to use this option: * set ENABLE_LOGGING, defined in 'source/logger.hpp', to 1, * rebuild the program. Then, for example: [Linux or Cygwin] $ cd /path_to_unzipped_package/build.<compiler>.release $ ./regex_crossword_solver --log MIT.log ../grid_tests/MIT.input.txt [Visual Studio] * open a Command Prompt * in the Command Prompt: cd path_to_unzipped_packageVisualStudio2015regex_crossword_solverrelease regex_crossword_solver.exe --log MIT.log ......grid_testsMIT.input.txt 08. How does it work? --------------------- Here is a brief overview of how the program works: 1. The command line is parsed. => see module 'command_line' 2. The text input file provided in the command line is parsed. => see module 'grid_reader' 3. The grid is constructed. This involves: a. setting up the geometry of the grid => see modules 'grid', 'rectangular_grid', 'hexagonal_grid', 'grid_line', 'grid_cell' b. building the parsed regexes from the string regexes - for each regex: * the string regex input is split into tokens => see modules 'regex_token' and 'regex_tokenizer' * the tokens are parsed into a regex => see module 'regex_parser' c. setting the alphabet to the explicitly mentioned characters d. initializing the cells to universal constraints (i.e., constraints which contain all the alphabet characters) 4. The regexes are optimized. => see modules 'grid', 'grid_line', 'regex' 5. The grid is solved. => see modules 'grid', 'grid_line', 'regex' The core of the program resides in method Regex::constrain() and its callees. Module 'regex' implements a regex engine specially designed for solving regex crosswords, as I found no regex library suitable for that purpose. 09. Known bugs -------------- No known bugs at the time of this writing. 10. Limitations --------------- 10.a. Supported crossword geometries ------------------------------------ Regex Crossword Solver only supports: * crossword grids of regular hexagonal shape (all sides of equal length), * crossword grids of rectangular shape. Regex Crossword Solver supports one or several regexes per line: * for crossword grids of regular hexagonal shape, the number of regexes must be the same for all the lines, * for crossword grids of rectangular shape, the number of regexes must be the same for all rows, and the same for all columns, but can be different for rows and columns. Crosswords with different numbers of regexes for different lines in a hexagonal grid, or for different rows/columns in a rectangular grid are not supported as such. It is easy, however, to work around this limitation, by adding '.*' (a "universal regex") to the crossword. For example, if the crossword has two rows, one with one regex, the other with two regexes: * row 1 -> regex_1 * row 2 -> regex_1, regex_2 You can add '.*' to row 1: * row 1 -> regex_1, .* * row 2 -> regex_1, regex_2 after which both rows have the same number of regexes. Since version 1.1.0, Regex Crossword Solver optimizes away universal regexes, so no performance penalty results from adding them. 10.b. Supported regular expressions ----------------------------------- There are many regular expression standards, and they differ in small details. I have used Python 3's specification of regular expressions as my reference, because Python is widely used, and because Python's regular expression specification is fairly well documented. The latest version of Python at the time of this writing is Python 3.5, and the regular expression specification for that version is available at https://docs.python.org/3.5/library/re.html In the rare cases where I found the specification was not clear enough, I checked the implementation's behavior, with deprecation warnings enabled: $ python3.5 >>> import warnings >>> warnings.simplefilter('always', DeprecationWarning) and I treated warnings as errors. Here is a (possibly non-exhaustive) list of regular expression features supported by Python but not by Regex Crossword Solver: * {,n} - treated the same as {0,n} by Python * uhhhh - 16-bit Unicode character * Uhhhhhhhh - 32-bit Unicode character * the (? constructs - many puzzles in 'https://regexcrossword.com/playerpuzzles' use these constructs, and are thus not solvable by Regex Crossword Solver * however, the following constructs are supported: * (?: non-capturing group (supported since version 1.1.0) * (?= positive lookahead (supported since version 1.1.0) Here is a (probably non-exhaustive) list of regular expression features which behave differently in Python and in Regex Crossword Solver: * Python supports backreference numbers up to 99, Regex Crossword Solver supports backreference numbers only up to 9. * In Python, 's' means "any whitespace character", in Regex Crossword Solver it only means space. If Regex Crossword Solver treated 's' as any whitespace character, some puzzles would have thousands of solutions instead of one. * Python accepts constructs such as 'a{ 1,2}' (note the space after the opening brace), where this means string "a{ 1,2}", not "a repeated once or twice". Regex Crossword Solver rejects such constructs. * Python does not accept nested repetitions - for example: 'a**'. Regex Crossword Solver does. * Python does not accept regular expressions such as '*'. Regex Crossword Solver does (it interprets it as 'epsilon repeated any number of times'). In my opinion, this is consistent with the fact that regular expressions such as '|' are accepted (both by Python and by Regex Crossword Solver). 11. License ----------- See /path_to_unzipped_package/LICENSE.txt. 12. Author ---------- Antoine Trux 13. Contacts ------------ http://solving-regular-expression-crosswords.blogspot.com or: author's email: firstname.lastname@gmail.com
Regex Crossword Solver is a program which can solve regular expression crosswords, such as the MIT puzzle http://www.mit.edu/~puzzle/2013/coinheist.com/rubik/a_regular_crossword/grid.pdf and some of the ones available at https://regexcrossword.com.
README.txt
Subscribe to:
Posts (Atom)
No comments:
Post a Comment