<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>KOMA.si, stari ... &#187; UTF-8</title>
	<atom:link href="http://www.koma.si/tag/utf-8/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.koma.si</link>
	<description>... so how come you don&#039;t understand this?</description>
	<lastBuildDate>Mon, 08 Aug 2011 00:40:38 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>RA: N03: UTF-8</title>
		<link>http://www.koma.si/2010/03/ra-n03-utf-8/</link>
		<comments>http://www.koma.si/2010/03/ra-n03-utf-8/#comments</comments>
		<pubDate>Sat, 27 Mar 2010 18:00:58 +0000</pubDate>
		<dc:creator>Ali Gator</dc:creator>
				<category><![CDATA[RA]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[Unicode]]></category>
		<category><![CDATA[UTF-8]]></category>

		<guid isPermaLink="false">http://www.koma.si/?p=92</guid>
		<description><![CDATA[<h3>Instructions:</h3>
Write a program (e.g., in C++) that encodes and decodes characters of the Unicode character set in the <a href="http://en.wikipedia.org/wiki/Utf-8">UTF-8</a> encoding.

<a href="http://tools.ietf.org/html/rfc3629">UTF-8</a> encodes Unicode code points up to 1114111, i.e. 10FFFF (hex). Such a value fits in 21 bits and takes at most 4 bytes (32 bits) once encoded. The byte values 0xFF and 0xFE are not allowed; if either appears while decoding, print an error warning.
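
The two limits above can be checked before any encoding or decoding takes place. The following is a minimal sketch, not part of the original assignment; the names <code>kMaxCodePoint</code>, <code>is_encodable_code_point</code> and <code>is_forbidden_byte</code> are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Highest code point covered by the assignment: U+10FFFF (1114111 decimal).
constexpr std::uint32_t kMaxCodePoint = 0x10FFFF;

// True if `cp` lies in the range U+0000..U+10FFFF.
// Assumption: only the upper bound is checked, as in the assignment text.
// A stricter validator would also reject the surrogate range U+D800..U+DFFF.
bool is_encodable_code_point(std::uint32_t cp) {
    return cp <= kMaxCodePoint;
}

// True for the two byte values that must trigger an error warning
// when they appear in the input of the decoder.
bool is_forbidden_byte(std::uint8_t byte) {
    return byte == 0xFE || byte == 0xFF;
}
```

A decoder could call <code>is_forbidden_byte</code> on every input byte and print the warning as soon as it returns true.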

Representation of a UTF-8 code:
1) If the first bit is 0, seven bits of the code follow (i.e., the same as the first 128 ASCII codes).
2) If the first bit is 1, the number of leading 1 bits before the first 0 bit gives the length of the code in bytes. The more significant bits of the code are encoded first, followed by the less significant part, according to the table below:
<table>
<tbody>
<tr>
<th>Unicode</th>
<th>Byte 1</th>
<th>Byte 2</th>
<th>Byte 3</th>
<th>Byte 4</th>
<th>Example</th>
</tr>
<tr>
<td><code>U+0000–U+007F</code></td>
<td><code>0<em>xxxxxxx</em></code></td>
<td></td>
<td></td>
<td></td>
<td>'$' <code>U+00<span style="text-decoration: underline;">2</span>4</code>
→ <code>0<strong><span style="text-decoration: underline;">010</span>0100</strong></code>
→ <code>0x24</code></td>
</tr>
<tr>
<td><code>U+0080–U+07FF</code></td>
<td><code>110<em>yyyxx</em></code></td>
<td><code>10<em>xxxxxx</em></code></td>
<td></td>
<td></td>
<td>'¢' <code>U+00<span style="text-decoration: underline;">A</span>2</code>
→ <code>110<strong>000<span style="text-decoration: underline;">10</span></strong>,10<strong><span style="text-decoration: underline;">10</span>0010</strong></code>
→ <code>0xC2,0xA2</code></td>
</tr>
<tr>
<td><code>U+0800–U+FFFF</code></td>
<td><code>1110<em>yyyy</em></code></td>
<td><code>10<em>yyyyxx</em></code></td>
<td><code>10<em>xxxxxx</em></code></td>
<td></td>
<td>'€' <code>U+<span style="text-decoration: underline;">2</span>0<span style="text-decoration: underline;">A</span>C</code>
→ <code>1110<strong><span style="text-decoration: underline;">0010</span></strong>,10<strong>0000<span style="text-decoration: underline;">10</span></strong>,10<strong><span style="text-decoration: underline;">10</span>1100</strong></code>
→ <code>0xE2,0x82,0xAC</code></td>
</tr>
<tr>
<td><code>U+10000–U+10FFFF</code></td>
<td><code>11110<em>zzz</em></code></td>
<td><code>10<em>zzyyyy</em></code></td>
<td><code>10<em>yyyyxx</em></code></td>
<td><code>10<em>xxxxxx</em></code></td>
<td><code>U+<span style="text-decoration: underline;">0</span>2<span style="text-decoration: underline;">4</span>B<span style="text-decoration: underline;">6</span>2</code>
→ <code>11110<strong><span style="text-decoration: underline;">0</span>00</strong>,10<strong>10<span style="text-decoration: underline;">0100</span></strong>,10<strong>1011<span style="text-decoration: underline;">01</span></strong>,10<strong><span style="text-decoration: underline;">10</span>0010</strong></code>
→ <code>0xF0,0xA4,0xAD,0xA2</code></td>
</tr>
</tbody>
</table>
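
The table above translates fairly directly into shifts and masks. The sketch below is one possible approach, not a reference solution; the function names <code>encode_utf8</code>, <code>decode_utf8</code> and <code>to_binary</code> are hypothetical. It assumes the decoder receives exactly one encoded character, and it does not reject overlong encodings or surrogates, which a strict decoder would.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Encode one code point into 1 to 4 bytes, one branch per table row.
std::vector<std::uint8_t> encode_utf8(std::uint32_t cp) {
    if (cp <= 0x7F) {
        return {static_cast<std::uint8_t>(cp)};
    }
    if (cp <= 0x7FF) {
        return {static_cast<std::uint8_t>(0xC0 | (cp >> 6)),
                static_cast<std::uint8_t>(0x80 | (cp & 0x3F))};
    }
    if (cp <= 0xFFFF) {
        return {static_cast<std::uint8_t>(0xE0 | (cp >> 12)),
                static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)),
                static_cast<std::uint8_t>(0x80 | (cp & 0x3F))};
    }
    if (cp <= 0x10FFFF) {
        return {static_cast<std::uint8_t>(0xF0 | (cp >> 18)),
                static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)),
                static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)),
                static_cast<std::uint8_t>(0x80 | (cp & 0x3F))};
    }
    throw std::invalid_argument("code point above U+10FFFF");
}

// Decode exactly one UTF-8 sequence; throws on malformed input.
std::uint32_t decode_utf8(const std::vector<std::uint8_t>& bytes) {
    if (bytes.empty()) {
        throw std::invalid_argument("empty input");
    }
    for (std::uint8_t b : bytes) {
        if (b == 0xFE || b == 0xFF) {
            throw std::invalid_argument("0xFE and 0xFF are not allowed");
        }
    }
    const std::uint8_t lead = bytes[0];
    std::size_t length = 0;
    std::uint32_t cp = 0;
    // The count of leading 1 bits in the first byte gives the length.
    if ((lead & 0x80) == 0x00)      { length = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { length = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { length = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { length = 4; cp = lead & 0x07; }
    else throw std::invalid_argument("invalid first byte");

    if (bytes.size() != length) {
        throw std::invalid_argument("wrong number of bytes");
    }
    // Every following byte must look like 10xxxxxx and adds six bits.
    for (std::size_t i = 1; i < length; ++i) {
        if ((bytes[i] & 0xC0) != 0x80) {
            throw std::invalid_argument("invalid continuation byte");
        }
        cp = (cp << 6) | (bytes[i] & 0x3F);
    }
    if (cp > 0x10FFFF) {
        throw std::invalid_argument("code point above U+10FFFF");
    }
    return cp;
}

// Render bytes as comma-separated groups of eight binary digits.
std::string to_binary(const std::vector<std::uint8_t>& bytes) {
    std::string out;
    for (std::uint8_t b : bytes) {
        if (!out.empty()) out += ',';
        for (int bit = 7; bit >= 0; --bit) {
            out += ((b >> bit) & 1) ? '1' : '0';
        }
    }
    return out;
}
```

Throwing exceptions is a choice made for this sketch; a program that follows the assignment would catch them and print the error warning instead.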
<p dir="ltr"><strong>EXAMPLE (character entered in decimal, code printed in binary)
</strong></p>
<p dir="ltr"><span style="text-decoration: underline;">Encoding
</span><em>Enter a Unicode character: 65
The UTF-8 code of the Unicode character is: 01000001</em>
<span style="text-decoration: underline;">Decoding
</span><em>Enter a UTF-8 code: 01000001
The code represents character no.: 65</em></p>]]></description>
		<wfw:commentRss>http://www.koma.si/2010/03/ra-n03-utf-8/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

