Idris2Doc : Text.Literate


A simple module to process 'literate' documents.

The module uses a lexer to split the document into code blocks,
delineated by user-defined markers, and code lines that are
indicated be a line marker. The lexer returns a document stripped
of non-code elements but preserving the original document's line
count. Column numbering of code lines are not preserved.

The underlying tokeniser is greedy.

Once it identifies a line marker it reads a prettifying space then
consumes until the end of line. Once identifies a starting code
block marker, the lexer will consume input until the next
identifiable end block is encountered. Any other content is
treated as part of the original document.

Thus, the input literate files *must* be well-formed w.r.t
to code line markers and code blocks.

A further restriction is that literate documents cannot contain
the markers within the document's main text: This will confuse the
recordLiterateError : Type
Totality: total
MkLitErr : Int -> Int -> String -> LiterateError

.column : LiterateError -> Int
.input : LiterateError -> String
.line : LiterateError -> Int
recordLiterateStyle : Type
  Description of literate styles.

A 'literate' style comprises of

+ a list of code block deliminators (`deliminators`);
+ a list of code line markers (`line_markers`); and
+ a list of known file extensions `file_extensions`.

Some example specifications:

+ Bird Style

MkLitStyle Nil [">"] [".lidr"]

+ Literate Haskell (for LaTeX)

MkLitStyle [("\\begin{code}", "\\end{code}"),("\\begin{spec}","\\end{spec}")]
[".lhs", ".tex"]

+ OrgMode

MkLitStyle [("#+BEGIN_SRC idris","#+END_SRC"), ("#+COMMENT idris","#+END_COMMENT")]

+ Common Mark

MkLitStyle [("```idris","```"), ("<!-- idris","--!>")]
[".md", ".markdown"]

Totality: total
MkLitStyle : List (String, String) -> ListString -> ListString -> LiterateStyle

.deliminators : LiterateStyle -> List (String, String)
  The pairs of start and end tags for code blocks.
.file_extensions : LiterateStyle -> ListString
  Recognised file extensions. Not used by the module, but will be
of use when connecting to code that reads in the original source
.line_markers : LiterateStyle -> ListString
  Line markers that indicate a line contains code.
embedCode : LiterateStyle -> String -> String
  Given a 'literate specification' embed the given code using the
literate style provided.

If the style uses deliminators to denote code blocks use the first
pair of deliminators in the style. Otherwise use first linemarker
in the style. If there is **no style** return the presented code
string unembedded.

@specification The literate specification to use.
@code The code to embed,

Totality: total
extractCode : LiterateStyle -> String -> EitherLiterateErrorString
  Given a 'literate specification' extract the code from the
literate source file (`litStr`) that follows the presented style.

@specification The literate specification to use.
@litStr The literate source file.

Returns a `LiterateError` if the literate file contains malformed
code blocks or code lines.

Totality: total
isLiterateLine : LiterateStyle -> String -> (MaybeString, String)
  Is the provided line marked up using a line marker?

If the line is suffixed by any one of the style's set of line
markers then return length of literate line marker, and the code stripped from the line
marker. Otherwise, return Nothing and the unmarked line.

Totality: total
relit : LiterateStyle -> String -> String
  Synonm for `embedCode`

Totality: total
unlit : LiterateStyle -> String -> EitherLiterateErrorString
  Synonym for `extractCode`.

Totality: total