author Mike Gerwitz <gerwitzm@lovullo.com> 2014-05-08 14:22:01 -0400
committer Mike Gerwitz <gerwitzm@lovullo.com> 2014-05-08 14:22:01 -0400
commit f2ca2885d7fb41bbc5e1c683298977d9d2a72373
tree 86f02343d8637c40c7af9c1f817f1358193ec81f /raterspec
parent fc1c3ad7df2250a30c24655511dd7dba7f4cd850
Extracted hostenv, input, and params chapters from dwelling specs
Diffstat (limited to 'raterspec')
3 files changed, 365 insertions, 0 deletions
diff --git a/raterspec/sec/hostenv.tex b/raterspec/sec/hostenv.tex
new file mode 100644
index 0000000..078d383
--- /dev/null
+++ b/raterspec/sec/hostenv.tex
@@ -0,0 +1,238 @@
+% Host Environment
+This specification defines a rater designed for integration with LoVullo's Quote
+\section{Quote Server Integration}
+\index{quote server|(}
+\incomplete The implementation \must satisfy at least one of the following
+ \item
+ The implementation \shall be written in, or compile to, ^JavaScript and \shall
+ provide a \func{rate} function, which \shall itself exchange data with the
+ Quote Server in an implementation-defined manner.\footnote{There is currently
+ no specification for the Quote Server; one is needed.}
+ \item
+ The implementation \shall be written in, or compile to, ^PHP and \shall
+ integrate with the \index{REST}RESTful interface provided by the ^[LoVullo
+ website] to provide an interface to the Quote Server.
+The implementation \shall return all data suitable for immediate storage in the
+\index{quote server|)}
+\section{Host Architecture and Operating System}
+The implementation \shallnot make any assumptions regarding the host
+architecture, except for those that are enumerated in this section. Furthermore,
+the implementation \shall ensure that the programming language in which the
+implementation is written does not violate the requirements of this section, but
+\shallnot be responsible for any assumptions made by the programming language
+that are not contradicted within this section.\footnote{This section is
+intentionally small, as the high-level languages used in the implementation
+do not rely on many architectural details.}
+\subsection{Word Size}
+\p{word-size} The host machine ^[word size] \shall be at least 32 bits
+long;\footnote{This word size is limited by the architectures of developer PCs
+and the PCs of users who may be testing the implementation outside of its
+integration with the Quote Server.} the implementation \shallnot depend on the
+availability of a larger word size unless such an implementation cannot affect
+the result of any calculation.
+The requirement of \pref{word-size} does not apply to ^floating-point datatypes,
+which \may use the full 80~bits permitted by the lower bound of the
+^[floating-point!IEEE 754] ^[floating-point!double-extended] precision format.
+\subsection{Memory Requirements}
+The implementation \shall have at least \minmem\ of memory available
+exclusively for its use.
+The implementation \may use additional memory if available, but \shallnot fail
+to operate within the requirements of this specification if additional memory is
+not available. If an implementation does not explicitly allocate its own memory
+(e.g., it is written in a dynamically allocated, garbage-collected language) and the
+implementation will automatically fail\footnotemark\ if the memory limit is
+exceeded (a hard memory limit), then the host environment \shall provide at
+least $32$~extra megabytes of memory in addition to the \minmem, allowing the
+implementation to query its memory usage and determine when the minimum memory
+limit has been exceeded.
+\footnotetext{An example of such a language is ^[PHP]. Other languages, such as
+JavaScript, will attempt to garbage collect rather than immediately dying.}
+It is the responsibility of the host environment---not the implementation---to
+limit memory consumption.
+Should the implementation be unable to allocate additional memory before reaching
+\minmem, and such a failure prohibits successful execution of all remaining
+applicable calculations, then the implementation \shall fail with an unspecified
+error; the remaining execution path after such a failure is \undefined.
+The implementation \shallnot depend on any reasonable scheduling expectations
+and \shall thus continue operation until explicitly interrupted in an
+unspecified manner.
+\section{Numeric Datatypes}
+An implementation \shallnot use ^fixed-point or ^[binary-coded decimal] (BCD)
+representations of numbers; ^floating-point \shall be used
+instead.\footnote{This restriction exists simply because LoVullo does not make
+use of these other formats.}
+All floating-point values \shall be represented as a single-precision floating
+point value as defined by ^[floating-point!IEEE 754]'s ^[floating-point!IEEE
+754!binary32] format. This format has a base of $2$; a 24-bit significand (with
+the highest-order bit implied); an 8-bit exponent; and a single sign bit.
+Whether higher-precision values are truncated or rounded in order to fit into a
+single-precision format is unspecified; the rounding mode is also unspecified.
+All other scalar datatypes that are not ^floating-point \shall be able to be
+represented by a signed, two's complement, 32-bit integer representation.
+Unsigned integer types \shallnot be used as the result of any calculation, but
+may be used to hold intermediate results, so long as those intermediate values
+are not returned as the result of another calculation. Intermediate results
+returned only for debugging purposes are exempt from this requirement and their
+limits are unspecified, so long as their values are not returned to the Quote
+Server.\footnote{These restrictions exist to cope with implicit numeric
+representations of various systems and languages.} Unsigned integer results that
+can be represented exactly as a ^floating-point value are also exempt from this
+restriction, so long as the return value indicates that the data type is
+\subsection{Floating Point Arithmetic}
+This section applies only to implementations that make use of ^floating-point
+arithmetic. Floating point is so-called because the radix point does not have a
+fixed position.
+A ^[floating-point!IEEE 754!binary32] ^floating-point value may be computed as
+$$
+ (-1)^s \left(1+\sum\limits_{i=1}^{23} c_{23-i} 2^{-i}\right)
+ 2^{e-127},
+$$
+where $s$ is the value of the sign bit, $c_i$ represents the base-2 significand
+(``coefficient'') digit $i$, and $e$ represents the exponent. The added $1$ in
+the calculation is to account for the implicit higher-order bit in the
+significand. The $-127$ offset at the end of the equation is the exponent
+bias.\footnote{While it is unlikely that an implementation will have to
+explicitly compute the value of a binary ^floating-point representation, the
+knowledge is very useful for understanding issues that arise from
+^floating-point arithmetic.}
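+As an illustration, consider the bit pattern \code{0x3E200000}, chosen here
+purely as an example. Its sign bit is $s=0$, its exponent field is $e=124$, and
+the only set significand digit is $c_{21}$ (the $i=2$ term of the sum), so the
+value is
+$$
+ (-1)^0 \left(1+2^{-2}\right) 2^{124-127} = 1.25 \times 2^{-3} = 0.15625.
+$$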
+\subsubsection{Handling of Precision Loss}
+Rounding errors are particularly problematic with the use of \func{floor} and
+\func{ceil} functions, in which the smallest precision error can drastically
+affect the result of a calculation.
+The implementation \shallnot assume the availability of access to hardware
+^floating-point ^[floating-point!status registers]---or any equivalent software
+representation---that indicates loss of precision due to truncation or rounding
+errors.\footnote{Such flags would be ideal, but are inaccessible from both
+JavaScript and ^[PHP]; we must cater to the lowest common denominator.}
+The implementation \may employ the common method of performing intermediate
+calculations in a higher-precision format (higher than single-precision) before
+storing the result in a single-precision ^floating-point format.\footnote{The
+rationale behind this decision is that all systems in this office use x86
+processors, which support 80-bit ^[floating-point!double-extended] precision
+^floating-point registers.}
+To avoid problems inherent with ^floating-point arithmetic, an implementation
+\may (and is encouraged to) use integer arithmetic, so long as the results can
+be sufficiently represented as a signed 32-bit integer type; the integer value
+may then be converted back into a ^floating-point type. Since a single-precision
+significand is $23$ bits in length, excluding the implicit high-order bit, this
+therefore means that an integer value of $-2^{23} \leq n \leq 2^{23}-1$ can be
+converted between the two types without loss of data, yielding a range of
+$-8388608 \leq n \leq 8388607$.
+ Let~$a=0.60$, $b=0.30$ and~$c=0.10$, where~$a$, $b$ and~$c$ are
+ single-precision ^floating-point numbers (^[floating-point!IEEE
+ 754!binary32]). Consider that we wish to apply these variables to the
+ seemingly innocuous calculation
+ $${\left\lfloor 100\left(a+b+c\right) \right\rfloor \over100},$$
+ which will round down to the nearest penny. Intuitively (and mathematically), we
+ would expect that
+ $$
+ {\left\lfloor 100\left(0.60+0.30+0.10\right) \right\rfloor \over100}
+ = {\left\lfloor 100\left(1.00\right) \right\rfloor \over100}
+ = {\left\lfloor 100 \right\rfloor \over100}
+ = 1.00.
+ $$
+ However, none of the values of~$a$, $b$ or~$c$ can be represented exactly in
+ any binary ^floating-point format, because none of them is a finite sum of
+ powers of two.
+ Therefore (leaving the details aside), depending on how the implementation
+ performs its arithmetic, we could end up with something like this:
+ $$
+ {\left\lfloor 100\left(0.60+0.30+0.10\right) \right\rfloor \over100}
+ = {\left\lfloor 100\left(0.99\overline{9}\right) \right\rfloor \over100}
+ = {\left\lfloor 99.9\overline{9} \right\rfloor \over100}
+ = {99 \over100}
+ = 0.99.\footnotemark
+ $$\footnotetext{This can be seen in ^PHP with \code{var\_dump( floor(
+ 100 * (0.60+0.30+0.10) ) / 100 )}, which yields the value $0.99$. Using the
+ v8 ^JavaScript engine, the same result of $0.99$ is obtained with
+ \code{Math.floor( 100 * (0.60+0.30+0.10) ) / 100}.}
+ While this specific equation may not seem too bad---resulting in only a
+ $0.01$ difference from the expected value, which is still very bad in
+ financial systems---consider what would have happened if we had simply taken
+ the floor; this would have resulted in a $1.00$ difference.
+ Alternatively, we could let~$a=60$, $b=30$ and~$c=10$, add the values using
+ integer arithmetic, which will always yield the value~$100$, and then use that
+ value in the equation, eliminating the floating-point precision loss. Since
+ we only care about values up to two decimal places here, such a conversion
+ could be done by casting the value of $100n$ to an integer, which would
+ truncate at the radix point, thereby removing the inaccurate low-order
+ bits.\footnote{In v8, \code{(0.6*100)+(0.3*100)+(0.1*100)} yields $100$; the
+ same is true for ^[PHP].}
+It is likely that an implementation does not need the full 24 bits of the
+significand. When the erroneous portion of a floating-point value is entirely
+contained within the low-order bits of the significand, an implementation \may
+truncate or round in an unspecified manner the value to the desired level of
+precision (if supported by the source language). This method may be useful in
+preventing rounding errors when using \func{floor} and~\func{ceil} functions.
+ In IEEE 754 ^[floating-point!IEEE 754!binary64] floating-point (the
+ double-precision format of ^JavaScript numbers), $0.1-0.01
+ \approx 0.09000000000000001$. This will cause problems if we try to round up to
+ the nearest penny, as \func{ceil} would then produce $0.10$ instead of the intended
+ $0.09$. This problem may be eliminated by truncating the value to two decimal
+ places before rounding, if supported by the source language.\footnote{In
+ ^[JavaScript], one could use the convoluted syntax
+ \code{+(0.1-0.01).toPrecision(1)} to yield $0.09$. In ^[PHP], \func{round} will
+ produce the closest floating-point approximation, which could potentially
+ introduce other problems, but happens to work in this case.}
+ One sure way of truncating a value is to store the portion of the significand
+ desired into an integer. For example, we could obtain $0.09$ by multiplying by
+ $100$, performing integer arithmetic and then dividing by $100$ once we are done.
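The floor pitfall and the integer-arithmetic workaround from the examples in hostenv.tex can be sketched in JavaScript as follows. This is a minimal illustration under stated assumptions, not part of the specification: the names floorToCentsNaive and sumAsIntegerCents are hypothetical, the use of Math.round to reach whole cents is a choice made for this sketch, and JavaScript numbers are double-precision rather than binary32, so the sketch shows the class of error rather than the exact bits a single-precision implementation would see. The range check mirrors the bound of -8388608 to 8388607 given in the text.

```javascript
// Bounds from the text: integers in [-2^23, 2^23 - 1] survive a round trip
// through a single-precision significand without loss.
const MIN_EXACT = -8388608;
const MAX_EXACT = 8388607;

// Naive approach: add the decimal fractions as binary floating point,
// then take the floor at two decimal places.
function floorToCentsNaive(a, b, c) {
  return Math.floor(100 * (a + b + c)) / 100;
}

// Integer approach: convert each operand to whole cents first, add as
// integers, and convert back only at the end. Math.round (an assumption of
// this sketch) absorbs the tiny representation error in each product.
function sumAsIntegerCents(a, b, c) {
  const cents = Math.round(a * 100) + Math.round(b * 100) + Math.round(c * 100);
  if (cents < MIN_EXACT || cents > MAX_EXACT) {
    throw new RangeError("sum does not fit the exactly representable range");
  }
  return cents / 100;
}

console.log(floorToCentsNaive(0.60, 0.30, 0.10)); // 0.99
console.log(sumAsIntegerCents(0.60, 0.30, 0.10)); // 1
```

Keeping the running total in integer cents means the only floating-point operation that can introduce error is the final division, which is why the second function returns the expected value.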
diff --git a/raterspec/sec/input.tex b/raterspec/sec/input.tex
new file mode 100644
index 0000000..52a1510
--- /dev/null
+++ b/raterspec/sec/input.tex
@@ -0,0 +1,84 @@
+% Input Data
+The implementation \shall accept input in a manner described by
+\p{indata} The implementation \shall accept dynamic input data as an
+associative array of values whereby the key represents the name of the input
+parameter and its associated value is a vector of (a)~character strings or
+(b)~an arbitrarily deep array of character strings. An implementation \shall
+support at least a vector of character strings to satisfy condition~(b), but
+\may choose to support a greater depth for implementation-specific data.
+\p{indata-vscalar} When a scalar is expected, an implementation \shall also
+accept a vector $v$, but \shall recognize only the first index $v_0$ of the
+vector and \shall act as though $v_0$ was the provided scalar.
+An implementation \shall implicitly return all undefined parameter requests as
+the scalar floating-point value $0.00$. This includes undefined indexes on
+defined parameters as well as undefined indexes on undefined parameters.
+\p{determ} The implementation \shall be ^[deterministic]---that is, its
+operation \shall be controlled exclusively by its input data and not by external
+state such as the current date or time of day, unless such data are passed as
+input data as defined by \pref{indata}.
+If an implementation relies on a ^[input data!third-party service] for a
+calculation~$c$, then the implementation is \exempt from the requirements of
+\pref{determ} only so far as is necessary to retrieve the data from the
+third-party service and populate~$c$; no other exemptions are permitted.
+If multiple arguments are provided for the same ^parameter name, the
+implementation \must fail in error, unless an implementation cannot be aware of
+such a condition.\footnote{As an example, a ^JavaScript object literal with
+duplicate fields does not throw an error, but instead silently drops
+\index{parameter type|(}
+\section{Parameter Type Definitions}
+An implementation \may use alternative names for these types, or neglect to
+implement the types at all, so long as the output and error conditions of the
+implementation cannot be distinguished in any way from an implementation that
+does implement the parameter types defined in this section.
+ If a parameter type is simply an ^[parameter type!alias] of another type, an
+ implementation may decide to use the corresponding base type. If a parameter
+ type is simply a restriction on its base type, an implementation may apply
+ such a domain restriction however it sees fit---e.g., a block of conditionals.
+\subsection{Base Parameter Types}
+The parameters defined within this section are not derived from any other
+parameter types and are known as \dfn{parameter type!base parameter types}.
+All other parameter types not defined within this section \shall be derived from
+these base types; such parameters are known as \dfn{parameter type!derived
+parameter types}.
+ \typedef core bool:
+ Boolean---any datatype able to hold a `yes' or `no' value ($1$ or $0$
+ respectively)
+ \typedef core float:
+ Single-precision binary floating-point number (as defined by IEEE 754
+ binary32)
+ \typedef core int:
+ Any signed 32-bit integer
+\index{parameter type|)}
+%% content set with \inputtypes will be output at this point
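The scalar-from-vector rule and the default for undefined parameters in input.tex can be sketched in JavaScript as below. This is only one possible reading of those paragraphs: the helper names getIndex and getScalar, the sample parameter names, and the decision to treat a bare scalar as a one-element vector are all assumptions made for illustration.

```javascript
// Hypothetical helper: fetch index i of parameter `name`, treating any
// undefined parameter or undefined index as the scalar 0.00.
function getIndex(input, name, i) {
  const has = Object.prototype.hasOwnProperty.call(input, name);
  const value = has ? input[name] : undefined;
  if (Array.isArray(value)) {
    return value[i] === undefined ? 0.00 : value[i];
  }
  // Assumption of this sketch: a bare scalar acts as a one-element vector.
  return i === 0 && value !== undefined ? value : 0.00;
}

// Hypothetical helper: when a scalar is expected, only v_0 is recognized.
function getScalar(input, name) {
  return getIndex(input, name, 0);
}

// Sample input data; the parameter names are invented for this example.
const sampleInput = { premium: ["100.00", "250.00"], state: "NY" };

console.log(getScalar(sampleInput, "premium"));   // first element only
console.log(getIndex(sampleInput, "premium", 5)); // undefined index
console.log(getScalar(sampleInput, "missing"));   // undefined parameter
```

Using hasOwnProperty rather than a plain property lookup keeps inherited properties such as toString from being mistaken for input parameters.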
diff --git a/raterspec/sec/params.tex b/raterspec/sec/params.tex
new file mode 100644
index 0000000..73e3b59
--- /dev/null
+++ b/raterspec/sec/params.tex
@@ -0,0 +1,43 @@
+% Input parameters
+An implementation \shall support each parameter defined in this section for the
+purpose of accepting input data.
+An implementation \shallnot ^fail in ^[fail!error] if a parameter is provided
+that is not listed within this section.\footnote{For example, this allows data
+to be used for multiple suppliers.}
+\p{param-fail} An implementation \must ^fail in ^[fail!error] if one or more of
+the parameters defined in this section are not provided, unless the parameter is
+unused in every calculation that applies to the input data. The exact failure
+point is unspecified, but the implementation must not return the value of any
+The implementation may check all required parameters before performing any
+calculations, or may decide to defer parameter checking until the parameter is
+actually used for the first time. The latter is expected to be more performant
+when the input data are well-formed most of the time, but wastes the cycles
+spent on calculations that are thrown away in the event of a failure. In the
+former case, the upfront validation cost is steeper, but no cycles are wasted on
+calculations in the event of an immediate failure.
+An implementation \must ^fail in ^[fail!error] if an ^argument associated with
+its parameter does not fall within the ^domain of the parameter, unless the
+parameter is unused in every calculation that applies to the input data. The
+exact failure point is subject to the same requirements as \pref{param-fail}.
+An implementation \mustnot ^fail in ^[fail!error] if an argument associated with
+its parameter \emph{does} fall within its ^[domain], but \may fail for other
+reasons defined within this specification unrelated to the domain.
+%% content set with \inputparams will be output at this point
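The two checking strategies that params.tex permits, upfront validation and validation deferred to first use, might look like the following JavaScript sketch. Everything named here is hypothetical: the parameter names, the domain predicates, and the helper functions are invented for illustration, and the domain table is assumed to list only parameters that the applicable calculations actually use.

```javascript
// Hypothetical domain table: parameter name -> predicate over its argument.
const sampleDomains = {
  dwelling_limit: (v) => Number.isInteger(v) && v > 0,
  is_rental: (v) => v === 0 || v === 1,
};

// Fail in error if a listed parameter is missing or outside its domain.
function checkParam(input, domains, name) {
  if (!Object.prototype.hasOwnProperty.call(input, name)) {
    throw new Error("missing parameter: " + name);
  }
  const inDomain = domains[name];
  if (inDomain && !inDomain(input[name])) {
    throw new RangeError("argument outside domain: " + name);
  }
}

// Upfront strategy: check every listed parameter before any calculation.
// Parameters not listed in `domains` are ignored, so extra input is allowed.
function checkAllParams(input, domains) {
  for (const name of Object.keys(domains)) {
    checkParam(input, domains, name);
  }
}

// Deferred strategy: check a parameter the first time it is read.
function makeLazyReader(input, domains) {
  const checked = new Set();
  return function read(name) {
    if (!checked.has(name)) {
      checkParam(input, domains, name);
      checked.add(name);
    }
    return input[name];
  };
}
```

With the deferred reader, a calculation that never touches a parameter never triggers its check, at the cost of possibly discarding earlier work when a later read fails.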