3 years ago · 7271e1a8bc
--- a/latex/result/thesis.pdf
+++ b/latex/result/thesis.pdf
--- a/latex/tex/kapitel/k4_algorithms.tex
+++ b/latex/tex/kapitel/k4_algorithms.tex
@@ -169,16 +169,16 @@ In order, to remain in a presentable range, the example in \ref{k4:arith-unscale
 
				     \toprule
			
 
				      \textbf{Symbol} & \textbf{Probability} & \textbf{Interval}\\
			
 
				     \midrule
			
 
				-			A & $\frac{2}{4}=0.11$ & ${x\in \mathbb{Q} | 0.0 <= x < 0.5}$\\
			
 
				-			C & $\frac{1}{4}=0.71$ & ${x\in \mathbb{Q} | 0.5 <= x < 0.75}$\\
			
 
				-			G & $\frac{1}{4}=0.13$ & ${x\in \mathbb{Q} | 0.75 <= x < 1.0}$\\
			
 
				+			A & $\frac{2}{4}=0.5$ & $[0.0, 0.5)$ \\ %${x\in \mathbb{Q} | 0.0 <= x < 0.5}$\\
			
 
				+			C & $\frac{1}{4}=0.25$ & $[0.5, 0.75)$ \\ %${x\in \mathbb{Q} | 0.5 <= x < 0.75}$\\
			
 
				+			G & $\frac{1}{4}=0.25$ & $[0.75, 1.0)$ \\ %${x\in \mathbb{Q} | 0.75 <= x < 1.0}$\\
			
 
				     \bottomrule
			
 
				   \end{longtable}
			
 
				 \end{footnotesize}
			
 
				 \rmfamily
			
 
				 
			
 
				-In the encoding process, the first symbol read from the sequence determines a interval, its symbol is associated with. Every following symbol determines a subinterval, which is determined by subdividing the previous interval into sections proportional to the probabilities from \ref{t:arith-prob}.
			
 
				-Starting with \texttt{A}, the most left interval in \ref{k4:arith-unscaled} is subdivided into intervals visulaized below. Leaving a available space of $[0.0, 0.5)$. From there the interval, representing \texttt{G} is subdivided, and so on until the last symbol \texttt{C} is processed. This leaves a interval of $[0.40625, 0.421275)$.\\
			
 
				+In the encoding process, the first symbol read from the sequence determines the interval that this symbol is associated with. Every following symbol determines a subinterval, which is formed by subdividing the previous interval into sections proportional to the probabilities from \ref{t:arith-prob}.
			
 
				+Starting with \texttt{A}, the leftmost interval in \ref{k4:arith-unscaled} is selected, leaving an available space of $[0.0, 0.5)$, which is subdivided into the intervals visualized below. From there, the interval representing \texttt{G} is subdivided, and so on until the last symbol \texttt{C} is processed. This leaves an interval of $[0.40625, 0.421875)$, which is marked in \ref{k4:arith-unscaled} with a red line. Since this interval is comparatively small, the illustration may give the impression that a single point is marked. This is not the case: the red line shows the position of the final interval.\\
			
 
				 %To encode a text, subdividing is used, step by step on the text symbols from start to the end
			
 
				 To store the encoding result in as few bits as possible, only a single number between the lower and upper end of the last interval is stored. To encode it in binary, the binary floating point representation of any number inside the interval for the last character is calculated.\\
			
 
				 For this example, the number \texttt{0.4140625} in decimal, or \texttt{0.0110101} in binary, would be calculated.\\
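				 % Worked sketch of the subdivision. Assumption: the example sequence is AGAC, which is
				 % consistent with the probabilities in the table and with the symbols named in the text.
				 % The names l, u, c_lo and c_hi are illustrative and do not appear elsewhere in the chapter.
				 As a worked sketch, assume the example sequence is \texttt{AGAC}. With $[l, u)$ as the current interval and $[c_{lo}(s), c_{hi}(s))$ as the interval of symbol $s$ from \ref{t:arith-prob}, each step would compute $l' = l + (u-l) \cdot c_{lo}(s)$ and $u' = l + (u-l) \cdot c_{hi}(s)$:
				 \[
				 \begin{array}{ll}
				 \texttt{A}: & [0.0,\ 0.5) \\
				 \texttt{G}: & [0.0 + 0.5 \cdot 0.75,\ 0.0 + 0.5 \cdot 1.0) = [0.375,\ 0.5) \\
				 \texttt{A}: & [0.375 + 0.125 \cdot 0.0,\ 0.375 + 0.125 \cdot 0.5) = [0.375,\ 0.4375) \\
				 \texttt{C}: & [0.375 + 0.0625 \cdot 0.5,\ 0.375 + 0.0625 \cdot 0.75) = [0.40625,\ 0.421875)
				 \end{array}
				 \]
				 Under this assumption, the binary fraction $0.0110101_2 = \frac{53}{128} = 0.4140625$ lies inside the final interval.\\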
			
@@ -202,11 +202,11 @@ For the decoding process to work, the \ac{EOF} symbol must be be present as the
 
				   %\includegraphics[width=15cm]{k4/arith-resize.png}
			
 
				   \includegraphics[width=15cm]{k4/arith-scaled.png}
			
 
				   \caption{Illustrative rescaling in arithmetic coding process.}
			
 
				-  \label{k4:rescale}
			
 
				+  \label{k4:arith-scaled}
			
 
				 \end{figure}
			
 
				 
			
 
				 % finite precision
			
 
				-The described coding is only feasible on machines with infinite percission \cite{witten87}. As soon as finite precission comes into play, the algorithm must be extendet, so that a certain length in the resulting number will not be exceeded. Since digital datatypes are limited in their capacity, like unsigned 64-bit integers which can store up to $2^64-1$ bit or any number between 0 and 18,446,744,073,709,551,615. That might seem like a great ammount at first, but considering a unfavorable alphabet, that extends the results lenght by one on each symbol that is read, only texts with the length of 63 can be encoded (62 if \acs{EOF} is exclued) \cite{moffat_arith}.
			
 
				+The described coding is only feasible on machines with infinite precision \cite{witten87}. As soon as finite precision comes into play, the algorithm must be extended so that the resulting number does not exceed a certain length. Digital data types are limited in their capacity: an unsigned 64-bit integer, for example, can store any number between 0 and $2^{64}-1$, that is, 18,446,744,073,709,551,615. That might seem like a large amount at first, but with an unfavorable alphabet that extends the length of the result by one for each symbol that is read, only texts of length 63 can be encoded (62 if \acs{EOF} is excluded) \cite{moffat_arith}. For compression with finite precision, rescaling is used. This method works by scaling up the intervals that result from subdividing. The process is illustrated in \ref{k4:arith-scaled}. The red lines indicate the final interval.
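				 % Sketch of one common rescaling rule. Assumption: this is not necessarily the exact
				 % variant drawn in the figure; l and u are illustrative names for the interval bounds.
				 One common way to realize such rescaling, given here only as a sketch, doubles the current interval $[l, u)$ whenever it lies entirely in one half of $[0, 1)$ and emits the bit that is thereby fixed:
				 \[
				 (l, u) \mapsto \left\{
				 \begin{array}{ll}
				 (2l,\ 2u) & \mbox{if } u \le 0.5 \mbox{ (emit bit 0)} \\
				 (2l - 1,\ 2u - 1) & \mbox{if } l \ge 0.5 \mbox{ (emit bit 1)}
				 \end{array}
				 \right.
				 \]
				 For instance, the interval $[0.375, 0.5)$ lies in the lower half, so a 0 is emitted and it becomes $[0.75, 1.0)$. This lies in the upper half, so a 1 is emitted and it becomes $[0.5, 1.0)$, and a further 1 restores the full interval $[0.0, 1.0)$. The emitted bits 011 are exactly the leading binary digits shared by every number in $[0.375, 0.5)$.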
			
 
				 
			
 
				 \label{k4:huff}
			
 
				 \subsection{Huffman encoding}