UniView is an XHTML-based application that lets you look up characters and character blocks, paste in and identify unknown characters, store your own information about characters, search on character data, do hex/dec/NCR conversions, highlight character types, and more. It supports Unicode 5.1.0 and is written with Web Standards to work on a variety of browsers.
This help file relates to Version 5.1.0e of UniView. The major changes in this version relate mostly to usability improvements. In particular, new controls are provided for working with lists, and context-sensitive help is available if you click on a label for a control. For details see the change history.
The first three digits of the UniView version number reflect the version of Unicode that it supports. This version therefore supports Unicode version 5.1.0.
Either select a Unicode block from the pull-down list at the top left, or type a range using hex numbers into the Custom text box and click on the accompanying icon.
The Custom field will accept various formats. The numbers must be in hexadecimal form and separated by a colon (the default), a hyphen, one or more spaces, or one or more periods. The numbers can be in the following formats: 1234, &#x1234;, \1234;, \u1234, U+1234. The actual number of hex digits can be between 1 and 6.
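The accepted formats can be illustrated with a short Python sketch. The regular expression, the function name parse_custom_range, and the decision to reject reversed ranges are assumptions made for this example; this is not UniView's own parser.

```python
import re

# One code point in any of the listed notations:
# 1234, &#x1234;, \1234;, \u1234, U+1234 (1 to 6 hex digits).
CODE_POINT = r"(?:&#x|\\u|\\|U\+)?([0-9A-Fa-f]{1,6});?"
# Separators: a colon, a hyphen, one or more spaces, or one or more periods.
SEPARATOR = r"(?::|-| +|\.+)"
RANGE = re.compile(rf"^{CODE_POINT}{SEPARATOR}{CODE_POINT}$")


def parse_custom_range(text):
    """Return (start, end) as integers, or None if the text is not a range."""
    match = RANGE.match(text.strip())
    if match is None:
        return None
    start, end = (int(digits, 16) for digits in match.groups())
    # Assumption: reversed or out-of-Unicode ranges are treated as invalid.
    if start > end or end > 0x10FFFF:
        return None
    return start, end


print(parse_custom_range("U+0E00-U+0E7F"))
```

The same call should accept 0E00:0E7F, &#x0E00;..&#x0E7F; or \u0E00 \u0E7F and give the same pair of numbers.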
You can display the result as either a table of characters or a list (that includes names) by clicking the checkbox lower down entitled Show range as list.
Unassigned character positions are shown in the matrix with a greyed out background (though you can change the colour, if you want).
Click on the checkbox next to Show as graphics to toggle between font glyphs and graphics. Using UTF-8 loads the page faster, but relies on you having a good Unicode font or selection of fonts to cover the Unicode code points. Graphics will be downloaded from the decodeunicode server by default.
If you are looking for fonts I recommend Wazu Japan's Gallery of Unicode Fonts or Alan Wood’s Unicode Resources. You should be able to find free fonts there for most characters. (You can also change the default font in UniView, if you wish.)
Type in the string you want to search for in the box labeled Search text and hit enter or click on the accompanying icon.
You can also use regular expressions in searches. For example, suppose you wanted to find all characters with the word 'tet'. You could type \btet\b into the input field. The \b represents a word boundary. If you wanted to search for entries containing either the word 'tet' or the word 'tat' you can use the 'or' operator |, as in \btet\b|\btat\b. (In UniView a colon can be used as a short form of \b, so the first example could have been written :tet:.)
Another example: you want to search for 'alpha', but you only want results for the Latin characters (not the many Greek or mathematical results). Simply use the search string latin.*alpha. The .* represents any number of intervening characters.
I haven't tested this feature to destruction, but most basic regular expressions that work in both JavaScript and PHP code should work.
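The behaviour of these search strings can be tried out with a few lines of Python. This is only a sketch: the NAMES table is a tiny hand-picked sample, the function names are hypothetical, and case-insensitive matching is an assumption based on the lowercase examples above.

```python
import re

# Tiny stand-in for the Unicode name data; the real database is far larger.
NAMES = {
    0x0251: "LATIN SMALL LETTER ALPHA",
    0x03B1: "GREEK SMALL LETTER ALPHA",
    0x05D8: "HEBREW LETTER TET",
    0x0628: "ARABIC LETTER BEH",
    0x1D306: "TETRAGRAM FOR CENTRE",
}


def expand_colons(pattern):
    """Replace each colon with a word-boundary marker (the colon shorthand)."""
    return pattern.replace(":", r"\b")


def search_names(pattern):
    """Return the sorted code points whose name matches the pattern."""
    regex = re.compile(expand_colons(pattern), re.IGNORECASE)
    return sorted(cp for cp, name in NAMES.items() if regex.search(name))


print([f"{cp:04X}" for cp in search_names(":tet:")])
print([f"{cp:04X}" for cp in search_names("latin.*alpha")])
```

Searching for plain tet also picks up TETRAGRAM FOR CENTRE, which is why the word-boundary form is useful.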
Note that by default searches match against character names and alternative names in the main Unicode database, and also against the information displayed for an individual character under the heading Description in the right panel. You can limit the search using the Names, Descriptions and Other checkboxes under the search input field. Other refers to alternative names.
You can also limit the search to the specific range of characters currently in the lower left panel. To limit the search, select the checkbox labelled Local. Matching characters will be highlighted. (You can produce a list of just the highlighted characters by clicking on the icon next to Make list from highlights. If you need to refine your search, you can then search again on this list, and so on.)
Cut and paste the string into the box labeled Text area and hit enter or click on the accompanying icon.
You can also apply a particular font to the text in the Text area by using the Font field alongside.
Type or paste the hexadecimal codepoints in the box labeled Code points, and hit return or click on the accompanying icon. (See also the next point.)
This field is very forgiving about the format of the text entered into the box. Most types of character escapes will be recognised, and you can even paste in surrounding text. For example, UniView will detect and list the characters referred to by codepoint in the text "the decomposition mapping is <U+CE20, U+11B8>, and not <U+110E, U+1173, U+11B8>." Of course, this is not foolproof, but should provide the desired results most of the time.
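As a rough illustration of this kind of forgiving input handling, the following Python sketch pulls hex code points out of surrounding text. The pattern and the name find_code_points are assumptions for the example and do not reproduce UniView's actual rules.

```python
import re

# Hex numbers with a common escape prefix, or bare hex "words".
TOKEN = re.compile(
    r"(?:U\+|\\[uU]|&#x)0*([0-9A-Fa-f]{1,6})\b"  # U+1234, \u1234, &#x1234;
    r"|\b([0-9A-Fa-f]{1,6})\b"                   # bare numbers such as 1234
)


def find_code_points(text):
    """Return every hex code point found in the text, in order of appearance."""
    found = []
    for match in TOKEN.finditer(text):
        value = int(match.group(1) or match.group(2), 16)
        if value <= 0x10FFFF:  # ignore numbers outside the Unicode range
            found.append(value)
    return found


sample = ("the decomposition mapping is <U+CE20, U+11B8>, "
          "and not <U+110E, U+1173, U+11B8>.")
print([f"{cp:04X}" for cp in find_code_points(sample)])
```

Like any approach of this kind, the sketch cannot tell that a word made up only of the letters a to f, such as 'Abba', is not a hex number.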
NEW By default, this feature will find and list hex code points. If you check the box labeled Decimal you can look up decimal code points instead.
Type the hex number in the box labeled Code points or the character itself in the box labelled Text area, and click on the Unihan icon. The Unihan information will be displayed in a separate page.
You can also find a link to the Unihan database in the detailed character information in the lower right panel when a CJK Unified Ideograph is displayed there.
Click on the conversion icon next to either the Code points or the Text area box to open the conversion tool. If there is a code point value or a string of characters in the box, values for those will be automatically shown when the conversion page opens.
Type the name of the font in the input field labelled Font. Then hit enter or click on the icon to the right. Characters for which there is a glyph in the font will use it. The default is that no font is set.
You can also type a series of fonts, as per the usual CSS syntax, so that if one font is not available the next will be looked for (eg. 'Arial Unicode MS', sans-serif). If you want to use quotes, make them single quotes.
On Gecko-based and Opera browsers, font substitution will ensure that characters will be rendered if they are not in the font chosen but available in another font on your system. In Internet Explorer, if the font chosen doesn't have a glyph for a character, that character will not be displayed. (This can sometimes be useful for determining which characters are contained in a font.)
Note that a specific font only covers a certain range of Unicode characters. To return to the default font, empty the box and hit return or click on the icon to the right.
If you don't have a font on your system that covers the characters shown, you can use the graphics switch.
This does not affect characters in the text area or in notes from the database.
(You can also specify your preferred default font).
Note that the font of the text in the Text area is set independently of this mechanism.
Because of some limitations in styling I was unable to work around, a fixed height of 600px is applied to the viewing area in the left panel. You can change this manually by clicking on Settings, then changing the value in the field set by default to '600px'. Don't forget to specify the 'px' or other measurement!
This is likely to only be useful when increasing the size of the characters on a monitor with a large number of vertical pixels.
Select a property from the Search properties drop down list. If the Local checkbox is selected this will show characters only in the specified custom range.
Click on the character. The details for that character appear on the right.
By default, the character will also be echoed to the Text area. You can turn the echo feature on and off by clicking on the echo icon.
Click on the icon just below the text area to create a list in text form.
Clicking on any character will echo it to the Text area. You can turn the echo feature on and off by clicking on the echo icon.
Double click on the character.
Click on the checkbox next to Show range as list. (This only applies if the lower left panel was populated by selecting a block or a custom range.) Lists show character names and hex codepoints, in addition to the character.
Set the Local checkbox next to the Search text control to checked, then specify what you want to search for in the Search text input box. Search strings can be regular expressions, and you can specify what aspects of the information about a character are searched (see above for details). Characters that match will be highlighted.
Select the type of property you want from the Show properties selection list, after ensuring that the adjacent Local checkbox is checked. Characters with the selected property will be highlighted. Properties available include general category and combining class or directionality.
Mouse over More actions and click on the icon next to the Make list from highlights control. The characters shown in the lower left panel will be reduced to a list of just those that were highlighted. (This is particularly useful for refining searches.)
NEW Mouse over More actions and click on the icon next to the Make list from non-highlighted items control. The characters shown in the lower left panel will be reduced to a list of just those that were not highlighted. (This is particularly useful for refining searches.)
NEW Mouse over More actions and click on the icon next to the Remove unassigned characters from list control. (This is particularly useful for creating a list of all the characters in a Unicode block.)
Mouse over a character and the decimal code point value pops up in a tooltip.
Click on Settings at the top right of the menu panels, then select a zoom factor from the pull down menu next to the label Left panel size that contains the text "100%".
[Note: Increasing or decreasing a browser's text zoom can multiply the effect of the selector. As sizes are mostly in pixels now, rather than ems, you can only apply this extra magnification in IE6 by selecting the accessibility options.]
NEW Click on Settings at the top right of the menu panels, then toggle the checkbox labelled Show U+ in lists.
Click on Settings at the top right of the menu panels, then toggle the checkbox labelled Hide numbers around matrix.
Double click on the hex number, and release the mouse button. Then click on the highlighted text and drag and drop or copy and paste the Hex number to the area with a yellow background towards the right of the menu panels. The character will be displayed just above as you move your cursor out of the yellow area.
Click on the button at the top of the detailed information pane.
Click on the button at the top of the detailed information pane.
Select the DB checkbox to enable this. When you view character details for which notes exist you will automatically see those notes.
The notes are stored in a database compiled by myself, and are continuously growing.
The notes are included using AJAX.
If you set up your bookmark to UniView to include database=on this feature will be on by default when you start UniView.
This replaces the previous file-based method of notes inclusion. The files used previously were just copies of parts of the database. Now there is no need to load in notes for different ranges of characters. All notes are available at any time.
NEW Click on the CLDR's Property demo link. A new window will open to show the entry for that character in the CLDR database. This provides additional, less commonly used data and properties relating to the character.
Click on the decodeUnicode link. A new window will open to show the entry for that character in the decodeUnicode database. decodeUnicode is a wiki where people can provide information about characters.
decodeUnicode.org is a wiki where people can contribute information about Unicode blocks and characters. It is developed at the Department of Design at the University of Applied Sciences in Mainz. The project is supported by the Federal Ministry of Education and Research (BMBF) and has the objectives of creating a basis for fundamental typographic research and facilitating a textual approach to the characters of the world for all computer users. (They also provide the graphic versions of characters for UniView.)
Click on the FileFormat link. A new window will open to show the entry for that character in the FileFormat database.
The FileFormat pages provide useful information for Java and .Net programmers.
Click on the Conversion tool link. A new window will open to show a number of possible alternative representations of the character, eg. numeric character entity references, percent escaped forms, hex and decimal codepoint information, etc.
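To give an idea of the kinds of representation involved, here is a small Python sketch that produces a few of them for one character. The function name and the dictionary keys are invented for the example; the conversion tool itself offers more forms than are shown here.

```python
from urllib.parse import quote


def representations(char):
    """Return some common alternative representations of one character."""
    cp = ord(char)
    return {
        "hex code point": f"U+{cp:04X}",
        "decimal code point": str(cp),
        "hex NCR": f"&#x{cp:X};",
        "decimal NCR": f"&#{cp};",
        # Percent escapes are built from the UTF-8 bytes of the character.
        "percent-escaped": quote(char, safe=""),
    }


for label, value in representations("\u01C5").items():
    print(f"{label}: {value}")
```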
Click on the link next to the subheading Script group and that block will be opened in the lower left panel.
The Text area is a set of controls for managing characters as text. By default, any character you click on in the lower left panel is echoed to the text area. You can toggle this feature on and off by clicking on the echo icon.
Alternatively, you can paste text into this area, or edit it directly. There is also an icon that allows you to add spaces with a click.
The insertion point for characters echoed to the text area can be changed in most browsers by just clicking where you want characters to appear (but not Internet Explorer, where characters are always added to the end of the line). You can also highlight a range of text and any typed or echoed characters will replace the highlighted range.
A select icon is provided to simplify copying the text in the text area. It highlights all the text in the text area. (Particularly useful to check you have caught all combining characters.)
You can display in the lower left panel a list of the characters in the text area, with names and codepoints, by clicking on the list icon. This is particularly useful for investigating text with characters you can't see or correctly identify. Simply paste the text into the text area and click on the icon, and UniView will produce a list of the names and codepoints for all the characters.
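The idea can be reproduced outside UniView with Python's unicodedata module. This is a minimal sketch; the function name describe is hypothetical, and Python's built-in character data may correspond to a later Unicode version than 5.1.0.

```python
import unicodedata


def describe(text):
    """Return one 'codepoint name' line per character in the text."""
    lines = []
    for ch in text:
        name = unicodedata.name(ch, "<no name>")
        lines.append(f"{ord(ch):04X} {name}")
    return lines


# An 'e' followed by a combining accent and an invisible zero width space.
for line in describe("e\u0301\u200B"):
    print(line)
```

Listing the characters this way exposes the combining accent and the invisible space, which are easy to miss in rendered text.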
Conversely, another icon copies to the text area all the characters currently displayed in the lower left panel. This is particularly useful for capturing search results, or making a list of all characters in a block, etc.
NEW You can convert the text to Unicode Normalization Form C or D (NFC or NFD) by clicking on either the NFC or the NFD button. The change may not be obvious, but if you then list the characters in the text area you should see any changes to the text listed below.
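The effect of the two normalization forms is easiest to see by listing code points before and after. The Python sketch below uses the standard unicodedata.normalize function; the helper name code_points is an assumption for the example. The Hangul syllable U+CE31 is chosen because it decomposes to the sequence U+110E, U+1173, U+11B8.

```python
import unicodedata


def code_points(text):
    """Return the hex code points of the text as a list of strings."""
    return [f"{ord(ch):04X}" for ch in text]


syllable = "\uCE31"
decomposed = unicodedata.normalize("NFD", syllable)
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_points(syllable), "->", code_points(decomposed))
print(code_points(decomposed), "->", code_points(recomposed))
```

Both strings normally look identical on screen, which is why the change only becomes visible once the characters are listed.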
You can change the font of the text in the text area using the provided input field. This can help when the text is not supported by the default font used by UniView.
You can also convert characters to various escape or other forms by clicking on the conversion icon, or look up Han characters in the Unihan database by clicking on the Unihan icon. A further icon clears all text from the text area.
If you click on the label of any of the controls listed below, you will open this document at the appropriate place for an explanation of that control.
Select a Unicode block from the pull-down list and the characters in the block will be displayed in the lower left panel. You can then click on characters to view detailed information about them or add them to the text area, etc.
You can display the result as either a matrix or a list (that includes names) by clicking the checkbox lower down entitled Show range as list.
Unassigned character positions are shown in the matrix with a greyed out background (though you can change the colour, if you want).
You can also specify a custom range by typing or pasting a hex codepoint range into the Custom box alongside this control.
If you type or paste a start and end code point value (in hex) into this control, the characters in the range will display in the lower left panel. Note that this can only be one contiguous range.
You can display the result as either a matrix or a list (that includes names) by clicking the checkbox lower down entitled Show range as list.
If the range you select does not fill a whole column when displayed as a matrix, surrounding characters are greyed out. (When displaying as a list, you will only see the characters in the range.)
The Custom field will accept various formats, making it easier to paste a range from elsewhere. The numbers must be in hexadecimal form but can be separated by either a colon (the default), a hyphen, one or more spaces, or one or more periods. The code point values themselves can be in the following formats: 1234, &#x1234;, \1234;, \u1234, U+1234. The actual number of hex digits can be between 1 and 6.
Add a list of hex code point values to this control and they will display in a list below.
You can also work with decimal code point values if the Decimal checkbox is selected.
You can do two other things with these code point values, in addition to listing the characters below.
Click on the conversion icon to convert the code points to various escaped or other forms, using the conversion tool.
Click on the Unihan icon to look up a character in the Unihan database. (Only the first character will be looked up.)
Click on the clear icon to quickly clear the control.
This field is very forgiving about the format of the text entered into the box. Most types of character escapes will be recognised, and you can even paste in surrounding text. For example, UniView will detect and list the characters referred to by codepoint in the text "the decomposition mapping is <U+CE20, U+11B8>, and not <U+110E, U+1173, U+11B8>." Of course, this is not foolproof, but should provide the desired results most of the time.
This control allows you to search for text in the Unicode database, and returns a list of matching characters.
You can use regular expressions in searches. For example, suppose you wanted to find all characters with the word 'tet'. You could type \btet\b into the input field. The \b represents a word boundary. If you wanted to search for entries containing either the word 'tet' or the word 'tat' you can use the 'or' operator |, as in \btet\b|\btat\b. (In UniView a colon can be used as a short form of \b, so the first example could have been written :tet:.)
Another example: you want to search for 'alpha', but you only want results for the Latin characters (not the many Greek or mathematical results). Simply use the search string latin.*alpha. The .* represents any number of intervening characters.
I haven't tested this feature to destruction, but most basic regular expressions that work in both JavaScript and PHP code should work.
Note that by default searches match against character names and alternative names in the main Unicode database, and also against the information displayed for an individual character under the heading Description in the right panel. You can limit the search using the Names, Descriptions and Other checkboxes under the search input field. Other refers to alternative names.
You can also limit the search to the specific range of characters currently in the lower left panel. To limit the search, select the checkbox labelled Local. Matching characters will be highlighted. (You can then produce a list of just the highlighted characters by clicking on the icon next to More actions > Make list from highlights. If you need to refine your search, you could then search again on this list, and so on.)
This control allows you to search for characters with a particular property. It lists matching characters below.
By default searches match against the characters currently listed in the lower left panel. Matching characters will be highlighted. (You can then produce a list of just the highlighted characters by clicking on the icon next to More actions > Make list from highlights. If you need to refine your search, you could then search again on this list, and so on.)
To enlarge the search to the whole of Unicode, deselect the checkbox labelled Local.
Unless you have specified a default font in the query of the URL you used to call UniView, the default is for no font to be set. Most browsers will look for available fonts on your system to display the characters.
You can use this control to explicitly change the font to one you have on your system. Simply type the name of the font in the box and hit return or click on the icon to the right.
You can also type a series of fonts, as per the usual CSS syntax, so that if one font is not available the next will be looked for (eg. 'Arial Unicode MS', sans-serif). If you want to use quotes, make them single quotes.
On Gecko-based and Opera browsers, font substitution will ensure that characters will be rendered if they are not in the font chosen but available in another font on your system. In Internet Explorer, if the font chosen doesn't have a glyph for a character, that character will not be displayed. (This can sometimes be useful for determining which characters are contained in a font.)
Note that a specific font only covers a certain range of Unicode characters. To return to the default font, empty the box and hit return or click on the icon to the right.
If you don't have a font on your system that covers the characters shown, you can use the graphics switch.
This does not affect characters in the text area or in notes from the database.
(You can also specify your preferred default font).
If the checkbox is selected, all characters except those in the text area and notes from the database will be shown as graphics, rather than text. The graphics are downloaded from the decodeunicode server.
If Unicode has recently issued a new version, it may take a while for the new characters to become available.
If the checkbox is selected, when you select a range using Show range or Custom the characters will be displayed as a list, rather than a matrix.
You can also use this to switch between matrix and list views of a range you have just selected.
When you click on a character listed in the lower left panel, detailed information about that character is displayed lower right. When the DB checkbox is selected, additional notes about a character are displayed (where available) at the bottom of the lower right hand panel. When you view character details for which notes exist you will automatically see those notes.
The notes are stored in a database compiled by myself. The information changes from time to time, as I add to or adapt the information in the database.
If you set up your bookmark to UniView to include database=on this feature will be on by default when you start UniView.
If you have highlighted items in a list in the lower left panel, using the Search text or Search properties controls, this control will remove all but the highlighted items from the list.
If you have highlighted items in a list in the lower left panel, using the Search text or Search properties controls, this control will remove all the highlighted items from the list, leaving the non-highlighted items only.
If you have unassigned characters in a list in the lower left panel this control will remove them from the list.
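One way to picture the removal of unassigned characters is to filter a range by general category. The following Python sketch treats a code point as unassigned when its category is Cn. The function name is hypothetical, and Python's character data may be newer than Unicode 5.1.0, so results for recently added characters can differ from UniView's.

```python
import unicodedata


def assigned_only(start, end):
    """Return the code points in start..end whose general category is not Cn."""
    return [
        cp for cp in range(start, end + 1)
        if unicodedata.category(chr(cp)) != "Cn"
    ]


# The Greek and Coptic block, which contains a few unassigned positions.
greek = assigned_only(0x0370, 0x03FF)
print(len(greek), "assigned code points")
```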
If this is checked, hex code point numbers in lists in the lower left panel will be preceded by U+. The default is just the number.
This allows you to change the order and items in lists appearing in the lower left panel. By default, you would see something like this:
0968 २ DEVANAGARI DIGIT TWO
With this control you can position the character before or after the number (or both!) or remove it altogether. You can also specify whether the list should show the number and/or the name of the character.
This control is provided for people who want some control over how the list will look when copied and pasted into their text.
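The range of list layouts described above can be sketched in Python as follows. The function and parameter names are invented for illustration and do not correspond to UniView's internal settings.

```python
import unicodedata


def format_entry(char, u_plus=False, char_position="after",
                 show_number=True, show_name=True):
    """Build one list line; char_position is 'before', 'after', 'both' or 'none'."""
    number = f"{ord(char):04X}"
    if u_plus:
        number = "U+" + number
    parts = []
    if char_position in ("before", "both"):
        parts.append(char)
    if show_number:
        parts.append(number)
    if char_position in ("after", "both"):
        parts.append(char)
    if show_name:
        parts.append(unicodedata.name(char, ""))
    return " ".join(part for part in parts if part)


print(format_entry("\u0968"))
print(format_entry("\u0968", u_plus=True, char_position="before"))
```

With the defaults the sketch reproduces the layout shown above; the other arguments rearrange or drop the individual parts.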
This allows you to hide the column and row numbers around a matrix. The default is to show the numbers.
Because of some limitations in styling I was unable to work around, a fixed height of 600px is applied to the viewing area in the left panel. This control allows you to change the height. Don't forget to specify the 'px' or other measurement!
This is likely to only be useful when increasing the size of the characters on a monitor with a large number of vertical pixels.
This control allows you to increase the size of the characters in the lower left panel (independently of text elsewhere on the page).
This is useful for pointing people to particular information using a URI, for example in email. By providing query parameters in the URI you can start up UniView with specific information displayed as follows:
You should only use one of these query parameters in a single call to UniView, with the exception of char=, which can be used with any of the others.
Eg. http://rishida.net/scripts/uniview/?block=latin%20extended-b&char=01C5
You can also start up UniView with character notes as follows:
uniview/?database=on This will automatically load notes from my character database when you view character details in the lower right panel. You can combine this parameter with any other. For more information about notes, see "display additional notes about characters where available" above.
eg. http://rishida.net/scripts/uniview/?block=thai&database=on
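If you generate such links from a script, the query string needs to be escaped properly (note the %20 in the first example). Here is a minimal Python sketch using the standard urllib.parse module; the function name uniview_url is an assumption, and only parameters mentioned in this document are used.

```python
from urllib.parse import quote, urlencode

BASE = "http://rishida.net/scripts/uniview/"


def uniview_url(**params):
    """Build a UniView link from keyword arguments such as block= or char=."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(params, quote_via=quote)
    return f"{BASE}?{query}" if query else BASE


print(uniview_url(block="latin extended-b", char="01C5"))
print(uniview_url(block="thai", database="on"))
```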
By providing query parameters when you call UniView you can modify the default settings for look and feel as follows:
You can use all or none of these query parameters in a single call to UniView.
If you store your bookmark with these parameters set, you will always open UniView with your preferences.
François Yergeau co-developed the Unicode Code Converter utility, and translated it into French.
Patrick Andries translated UniView into French, but that was many versions ago, and the French version is no longer available.
Custom ranges are no longer enlarged to fill full columns in the matrix. Full columns are still shown, but characters not in the range specified are greyed out. When displaying the range as a list, only the characters in the specified range appear.
Context-sensitive help was added. The labels for controls link to a new section in this document where the controls are explained one by one. To get the help, click on the label.
Some changes were made to help those who want to copy and paste lists of characters to other documents. There are new settings that make it possible to automatically prefix hex code point numbers in lists with U+, if you prefer, and tailor what appears in the list, and where the character is shown. (You can also set U+ to appear by default by including u=yes in the URI you use to call UniView.) These controls are accessed by clicking on Settings (used to be called Options).
A control was added to remove unassigned characters from a list. This can be useful, for example, if you want a list of all characters in a block.
Another control was added that removes the highlighted characters from a list. This and the previous control were moved into a popup that opens as you mouse over the text More actions.
The default font was set to nothing, and the Font control was moved from the settings popup to the main control area. To reset the default font, simply delete the last font name in the font control and hit return.
If an unassigned character is displayed in the right panel, it is now possible to display the group it belongs to on the left. Hex code point numbers for unassigned characters in lists are now a minimum of 4 characters long. Unassigned characters now have a grey background in all lists. DB notes are no longer reported for unassigned characters (bugfix).
A couple more minor user interface changes.
A major feature change is the addition of buttons to the Text area to allow conversion of the text to NFC or NFD normalization forms. (You may not notice the change until you list the characters.)
The control panel was also substantially rearranged again to hopefully make it easier for newcomers to see what they can do.
The Code point conversion feature was upgraded to handle decimal code point values.
A single character in the codepoints area or text area is now listed in the lower left panel when you click on the list icon, rather than in the right-hand properties panel. This is to improve consistency and avoid surprises.
Added a link to the CLDR property demo from the right panel to give access to additional properties.
Improved the parsing of codepoints when surrounded by text in the Code point input field, so that it now works with &#x...; and \u... and \U... escapes.
Jettisoned some unneeded code to reduce download by around 40-50K bytes. Implemented the NFC/NFD feature using AJAX, to avoid putting the download size back up.
When you delete the contents of the text area or the code point area, the associated input field is given focus, so you are ready for input.
A couple more minor bug fixes.
Removed the two Highlight selection boxes. These used to highlight characters in the lower left panel with a specific property value. The Show selection box on the left (used to be Show list) now does that job if you set the Local checkbox alongside it. (Local is the default for this feature.)
As part of that move, the former SiR (search in range) checkbox that used to be alongside Custom range has been moved below the Search for input field, and renamed to Local. If Local is checked, searching can now be done on any content in the lower left panel, and the results are shown as highlighting, rather than a new list.
To complement these new highlighting capabilities, a new feature was added. If you click on the icon next to Make list from highlights the content of the lower left panel will be replaced by a list of just those items that are currently highlighted - whether the highlighting results from a search or a property listing. Note that this can also be useful to refine searches: perform an initial search, convert the result to a list, then perform another search on that list, and so on.
Finally got around to putting icons after the pull-down lists. This means that if you want to reapply, say, a block selection after doing something else, only one click is needed (rather than having to choose another option, then choose the original option). The effect of this on the ease of use of UniView is much greater than I expected.
Added an icon to the text area. If you click on this, all the characters in the lower left panel are copied into the text area. This is very useful for capturing the result of a search, or even a whole block. Note that if a list in the lower left panel contains unassigned code points, these are not copied to the text area.
As a result of the above changes, the way Show as graphics and Show range as list work internally was essentially rewritten, but users shouldn't see the difference.
Changed the label Character area to Text area.
Moved the Cut&paste field downwards, made it larger, and changed the label to Character area. This should make it easier to deal with text copy/cut & paste, and more obvious that that is possible with UniView. It is much clearer now that UniView provides character map/picker functionality, and not just character lookup.
Whereas previously you had to double-click to put a character in the lower left pane into the Cut&paste field, UniView now echoes characters to the Character area every time you (single) click on a character in the lower left hand pane. This can be turned off. Double-clicking will still add the codepoint of a character in the lower left panel to the Code points field.
The Character area has its own set of icons, some of which are new: ie. you can select the text, add a space, and change the font of the text in the area (as well as turn the echo on and off). I also spruced up the icons on the UI in general.
Note that on most browsers you can insert characters at the point in the Character area where you set the cursor, or you can overwrite a highlighted range of characters, whereas (because of the non-standard way it handles selections and ranges) Internet Explorer will always add characters to the end of the line.
The Code points field has also been enlarged, and I moved the Show list pull-down to the left and Show as graphics and Show page as list to the right. This puts all the main commands for creating lists together on the left.
When you mouse over a character in the lower left pane you now see both hex and decimal codepoint information. (Previously you just saw an unlabelled decimal number.) You will also find decimal code point values for characters displayed in the lower right panel.
Fixed a bug in the Code points input feature so that trailing spaces no longer produce errors, but also went much further than that. You can now add random text containing codepoints or most types of hex-based escaped characters to the input field, and UniView will seek them out to create the list. For example, if you paste the following into the Code points field:
the decomposition mapping is <U+CE20, U+11B8>, and not <U+110E, U+1173, U+11B8>.
the result will be:
CE20: 츠 [Hangul Syllables]
11B8: ᆸ HANGUL JONGSEONG PIEUP
110E: ᄎ HANGUL CHOSEONG CHIEUCH
1173: ᅳ HANGUL JUNGSEONG EU
11B8: ᆸ HANGUL JONGSEONG PIEUP
Of course, UniView is not able to tell that an ordinary word like 'Abba' is not a hex codepoint, so you obviously need to watch out for that and a few other situations, but much of the time this should make it much easier to extract codepoint information.
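The kind of scan described above can be sketched roughly as follows. This is only an illustration in Python, not UniView's actual code: the pattern, the function name extract_code_points, and the exact set of escape prefixes are assumptions chosen to reproduce the example.

```python
import re

# Assumed pattern (not UniView's own): an optional escape prefix
# (U+, \u, &#x or a bare backslash) followed by 1 to 6 hex digits
# that are not part of a longer alphanumeric word.
PATTERN = re.compile(
    r"(?:(?:U\+|\\u|&#x|\\)|(?<![0-9A-Za-z]))"
    r"([0-9A-Fa-f]{1,6})(?![0-9A-Za-z])"
)

def extract_code_points(text):
    """Return every hex-looking token in text as an integer code point."""
    values = [int(m.group(1), 16) for m in PATTERN.finditer(text)]
    # Discard anything beyond the Unicode range.
    return [v for v in values if v <= 0x10FFFF]

sample = ("the decomposition mapping is <U+CE20, U+11B8>, "
          "and not <U+110E, U+1173, U+11B8>.")
print([f"{cp:04X}" for cp in extract_code_points(sample)])
```

As with UniView itself, a sketch like this cannot tell that a word such as 'Abba' is not meant as a hex number, because every letter in it is a valid hex digit.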
I still haven't found a way to fix the display bug in Safari and Google Chrome that causes initial content in the lower left pane to be only partially displayed.
A large amount of code was rewritten to enable data to be downloaded from the server via AJAX at the point of need. This eliminates the long wait when you start to use UniView without the database information in your cache. This means that there is a slightly longer delay when you view a new block, but the code is designed so that if you have already downloaded data, you don't have to retrieve it again from the server.
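The download-once behaviour can be pictured with a small cache. The sketch below is in Python purely for illustration; the class name BlockCache and the fetch callable are hypothetical stand-ins for the AJAX request, not names taken from UniView.

```python
class BlockCache:
    """Fetch data for a block the first time it is needed, then reuse it."""

    def __init__(self, fetch):
        # fetch is a hypothetical callable (block name -> data) that
        # stands in for the request to the server.
        self._fetch = fetch
        self._store = {}

    def get(self, block):
        # Only go to the server if this block has not been seen before.
        if block not in self._store:
            self._store[block] = self._fetch(block)
        return self._store[block]


requests_made = []

def fake_fetch(block):
    # Stand-in for the server round trip; records each request.
    requests_made.append(block)
    return {"block": block}

cache = BlockCache(fake_fetch)
cache.get("Myanmar")
cache.get("Myanmar")   # served from the cache, no second request
cache.get("Thai")
print(requests_made)
```

The first view of a block pays for one request; every later view of the same block costs nothing.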
The search mechanism was also rewritten. The regular expressions used must now be supported in both JavaScript and PHP (PHP is used if not searching within the current range). When 'other' is ticked, the search will look in the alternative name fields, but not in other property settings, so you can no longer use something like ;AL; to search for characters with a particular property. (Use 'Show list' instead.)
Removed several zero-width space characters from the code, which means that UniView now works with Google Chrome, except for some annoying display bugs that I'm not sure how to fix - for example, the first time you try to display any block you only seem to get the top line (although, if you click or drag the mouse, the block is actually there). This seems to be WebKit related, since it happens in Safari, too.
Updated to cover Unicode Version 5.1.0.
Added <option value="(;R;)|(;AL;)">Right-to-left (R or AL)</option> to property lister.
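To see why the option value is written that way, here is a rough Python sketch that applies the same expression to semicolon-separated records. The three sample lines follow the layout of the Unicode Character Database file UnicodeData.txt; that UniView stores its records in exactly this form is an assumption.

```python
import re

# The expression from the option value: bidi class R or AL,
# delimited by the semicolons that separate fields in a record.
RTL = re.compile(r"(;R;)|(;AL;)")

# Sample records in UnicodeData.txt layout (an assumed storage format).
records = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "05D0;HEBREW LETTER ALEF;Lo;0;R;;;;;N;;;;;",
    "0627;ARABIC LETTER ALEF;Lo;0;AL;;;;;N;;;;;",
]

# Keep the code point (first field) of every record that matches.
rtl_code_points = [r.split(";")[0] for r in records if RTL.search(r)]
print(rtl_code_points)
```

Including the semicolons in the pattern keeps a plain R or AL inside a character name from matching by accident.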
Bugfix: fixed ranges supplied via URI query (used to still split).
Changed the custom range input to a single field that will accept various range formats.
Added the ability to select whether Search looks at any combination of character names only, other parts of a record in the Unicode database, or the other character description information, and added a message to say how many characters were matched.
Added the ability to search within the range specified in the field entitled Range.
Added the ability to list characters with a given General or Bidirectional property (within a specified range or not).
Added an AJAX link to my database of information about Unicode characters. If enabled, using the DB checkbox, this automatically retrieves any available data for a character when information about that character is displayed in the lower right panel. You can also specify that UniView should open with that set as the default using database=on in the URI used to call UniView.
Because of the previous improvement, I removed the ability to link in a file of information about characters. (The information in the files was a copy of the information in the database.)
Moved the Code point(s) and Cut & paste fields lower, to make them easier to use.
Fixed a bug that was preventing the Search function finding characters in the Basic Latin block.
Bugfix: a range like 0036:0067 will always show full rows now; a range with start higher than end will show alert.
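A rough sketch of that behaviour in Python follows. The function name parse_range, the row width of 16, and the accepted separators (colon, hyphen, periods, spaces) are assumptions for illustration, and only bare hex numbers are handled.

```python
import re

def parse_range(text, row=16):
    """Parse 'start:end' (hex) and widen it to whole rows of `row` cells."""
    # Assumed separators: colon, hyphen, periods or whitespace.
    parts = [p for p in re.split(r"[:\-.\s]+", text.strip()) if p]
    if len(parts) != 2:
        raise ValueError("expected two hex numbers")
    start, end = (int(p, 16) for p in parts)
    if start > end:
        # UniView shows an alert here; the sketch raises instead.
        raise ValueError("range start is higher than range end")
    first = start - start % row          # round down to the start of a row
    last = end - end % row + row - 1     # round up to the end of a row
    return first, last

first, last = parse_range("0036:0067")
print(f"{first:04X}:{last:04X}")
```

With a row width of 16, the range 0036:0067 widens to 0030:006F, so the table never begins or ends partway through a row.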
Added a reference to decodeunicode when graphics are displayed in the left column.
Bugfix: the search parameter no longer breaks when graphics etc. are toggled.
You can now specify windowHeight parameter at startup in the URI's query string.
Extended the ability to open UniView with data displayed from a URI. In addition to specifying a block and a character, you can now specify a range, a list of codepoints, a list of characters, or a search string. This is useful for pointing people to results using URIs in links or email.
Switching between graphics or fonts for display of characters now refreshes the right panel also.
Clicking on the information about the script group of a character displayed in the right panel will cause that block to be displayed in the left panel. This is particularly useful when you find a single character and want to know what's around it.
Replaced the use of hyphens to specify block names in URI queries with underscores or %20. This may break some existing URIs, but fixes a bug that meant that block names that actually contain hyphens were not displaying.
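The effect of the change can be sketched in a couple of lines of Python. The function name block_name_from_query is hypothetical, and the sketch only covers turning the query value back into a block name, not how UniView reads the query string.

```python
from urllib.parse import unquote

def block_name_from_query(value):
    """Turn a block name taken from a URI query back into its real name."""
    # %20 is decoded to a space and underscores become spaces;
    # hyphens are left alone because some block names contain them.
    return unquote(value).replace("_", " ")

print(block_name_from_query("Basic_Latin"))
print(block_name_from_query("Latin-1_Supplement"))
print(block_name_from_query("Basic%20Latin"))
```

Because hyphens no longer stand in for spaces, a name such as Latin-1 Supplement survives the round trip intact.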
Added an option to the right hand panel to display the current character in the Unicode Conversion tool.
Fixed some other bugs related to specifying Basic Latin block in a URI.
Reinstated CJK Unified Ideographs and Hangul Syllables in the block selection pull-down, but added a warning and opt out if the block you are about to display contains more than 2000 characters. Also added a warning and opt out if you try to specify a range of over 2000 characters.
Substantially revised the code so that UniView now handles ideographic and hangul characters and other characters not in the Unidata database. For example, ideographs now display in the left panel for a specified range and property values are available in the right panel.
Added regular expression support to the search input field.
Changes to the user interface: moved highlighting controls to the initial screens and moved others, such as the chart numbering toggle, to the submenu under "Options"; provided wider input fields for codepoint and cut&paste input; replaced the graphics and list toggle icons with checkboxes; provided an icon to quickly clear the contents of the codepoint and cut&paste input fields. A link to the UniHan database was added alongside the Cut & paste input field: when clicked, this icon looks up the first character in either field. A link to the UniHan database was also added to the right panel when a Unified CJK character is displayed there.
The Codepoint input field now accepts more than one codepoint (separated by spaces).
When you double-click on a character in the left panel the codepoint is appended to the Codepoint input field as well as adding the character to the Cut & paste field.
When you click in the checkbox Show as graphics the change is immediately applied to whatever is in the left panel. It no longer redisplays the range if you are looking at, say, a list of characters generated by the Codepoint input, but redisplays the same list.
Set the default font to "Arial Unicode MS, sans-serif".
Added a message for those who do not have JavaScript turned on, and messages to please wait while data is being downloaded on initial startup.
Fixed the icons linking to the converter tool, so that the contents of the adjacent field are passed to the converter and converted automatically.
Added links in the right panel to FileFormat pages (in addition to decodeUnicode). The FileFormat pages provide useful information for Java and .Net users about a given character.
Removed the option to specify your own character notes (I'm not aware that anyone ever did, since it hasn't worked for a while now and no-one has complained). This is because AJAX technology will not allow an XML file to be included from another domain. When that is fixed I will reinstate it.
Fixed a number of other bugs, particularly related to supplementary character support and highlighting.
Updated to support Unicode 5.0.0.
Restyled the menu panels, moving some less used functions to pop up windows to save on horizontal space.
Implemented an AJAX approach for incorporating notes files. This means that the page no longer has to be reloaded to add notes. It is now also possible to add more than one set of notes at a time. Note that these changes require a small change to the markup of notes files - the div containing the notes for display has to have a class name 'notes' as well as the id for the character.
I added some bundled notes files - most notably myanmar. Note that these are subject to change on an ongoing basis.
Most of the properties displayed in the character-detail panel on the right are taken from the unicodedata file at the moment. I plan to incorporate additional property information over the coming months, but wanted to release this now so that you can get information about Unicode 5 characters sooner rather than later.
Added a link to the decodeUnicode wiki for each character that is displayed in the right-hand panel.
Provided a way to start up UniView with a particular block and/or character displayed as a table in the lower panels. This should be particularly useful for pointing a person to a particular Unicode block or character in a URI.
Fixed a couple of minor bugs in the CSS.
Rearranged the top of the page to allow UniView to be used in narrower windows.
Added support for Unicode version 4.1.0.
Retrieves graphics from decodeunicode.org rather than the slow-loading and sparse graphics that were available from the Unicode site. Also added my own graphics where decodeunicode has gaps.
Moved the files to PHP. This enables a different approach to the inclusion of user-defined notes that now works on IE and Opera, too.
Another benefit of using PHP is that you can now prep the conversion page with data in the 'Code point' or 'Cut & paste' fields. By clicking on the appropriate icon, the conversion page will now open with the conversions already done for the relevant field.
Yet another benefit of PHP is that, if you really want to, you can now set various preferences related to the initial look and feel by specifying them as query parameters when you call UniView.
NOTE: If you want to be able to download UniView to your hard drive and you don't have a server and PHP, let me know. If enough people ask for it, I will create a downloadable zipped package again that will work without PHP (and without the additional notes feature). I will also post notes on how to customise various aspects of the setup.
Note also that I have disabled links to the French version until a new French translation has been prepared. I will probably not do language-based content negotiation.
Surrogate support added.
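For readers unfamiliar with surrogates, the standard UTF-16 arithmetic is sketched below in Python. The function names are hypothetical, and this shows only the general conversion, not how UniView implements its support.

```python
def to_surrogates(cp):
    """Split a supplementary code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000
    # Top 10 bits go into the high surrogate, bottom 10 into the low one.
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def from_surrogates(high, low):
    """Combine a high and a low surrogate back into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogates(0x1D11E)   # U+1D11E MUSICAL SYMBOL G CLEF
print(f"{high:04X} {low:04X}")
```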
You can now double-click on any line in a list on the left, and the character will appear in the Cut&Paste field above.
Han and Hangul character glyphs are now displayed in the right panel after entering a codepoint in the Code Point field. There may not be much information available, but at least you can see the character if you have a font that supports it.
Minor improvements to user interface, including provision of tooltips for all feature selectors.
Disabled (attempts to) display user-defined notes for IE and Opera. I still haven't found how to make it work, even using proprietary coding, but at least the attempt won't crash the browser now.
Provided a facility to allow visible area in left panel to be increased.
Name changed to UniView.
Support for Unicode 4.0.0.
No frames. Cross-browser support.
You can specify your preferred default font for display of Unicode characters in prefs.js. If an alternative font is applied using the control on the page, it remains in force for any view until the user sets it back to the default.
Highlighting of General or Bidi properties remains in force until you disable it, and applies to any matrix or list in the left panel (i.e. including search results and cut & paste results).
Script blocks are now grouped with visible labels in the main range-selection pulldown.
Mousing over a character in matrix or list view produces a tool tip containing the decimal code value for the character. In the previous version this was the Hex value, and was limited to the matrix view.
There are no facilities to display information in a pop-up window instead of in the main window. If you want to temporarily display information separately, open a new window.
You used to be able to double-click on a list or on the character descriptions to make the highlighted text appear in various fields. This has not been implemented, but you can still highlight and drag, or copy and paste the text.
Because character sizes are specified in pixels for cross-browser consistency, you must use IE's accessibility options to increase character size in IE over and above what is available from the font size setting provided on the page.
Options for displaying in-page descriptions of script blocks have been disabled. Open the files in a separate tab or window as a standalone file.