Unicode Nearly Plain-Text Encoding of Mathematics

Version 3

Murray Sargent III

Publisher Text Services, Microsoft Corporation

10-Mar-10

Unicode Technical Note 28

1. Introduction
2. Encoding Simple Math Expressions
2.1 Fractions
2.2 Subscripts and Superscripts
2.3 Use of the Blank (Space) Character
3. Encoding Other Math Expressions
3.1 Delimiters
3.2 Literal Operators
3.3 Prescripts and Above/Below Scripts
3.4 n-ary Operators
3.5 Mathematical Functions
3.6 Square Roots and Radicals
3.7 Enclosures
3.8 Stretchy Characters
3.9 Matrices
3.10 Accent Operators
3.11 Differential, Exponential, and Imaginary Symbols
3.12 Unicode Subscripts and Superscripts
3.13 Concatenation Operators
3.14 Comma, Period, and Colon
3.15 Ordinary Text Inside Math Zones
3.16 Space Characters
3.17 Phantoms and Smashes
3.18 Arbitrary Groupings
3.19 Equation Arrays
3.20 Math Zones
3.21 Equation Numbers
3.22 Linear Format Characters and Operands
3.23 Equation Breaking and Alignment
3.24 Size Overrides
4. Input Methods
4.1 Character Translations
4.2 Math Keyboards
4.3 Hexadecimal Input
4.4 Pull-Down Menus, Toolbars, Context Menus
4.5 Macros
4.6 Linear Format Math Autocorrect List
4.7 Handwritten Input
5. Recognizing Mathematical Expressions
6. Using the Linear Format in Programming Languages
6.1 Advantages of Linear Format in Programs
6.2 Comparison of Programming Notations
6.3 Export to TeX
7. Conclusions
Acknowledgements
Appendix A. Linear Format Grammar
Appendix B. Character Keywords and Properties
Version Differences
References

1. Introduction

Getting computers to understand human languages is important in increasing the utility of computers. Natural-language translation, speech recognition and generation, and programming are typical ways in which such machine comprehension plays a role. The better this comprehension, the more useful the computer, and hence considerable effort has been devoted to these areas since the early 1960s. Ironically, one truly international human language that tends to be neglected in this connection is mathematics itself.

With a few conventions, Unicode¹ can encode many mathematical expressions in readable nearly plain text. Technically this format is a “lightly marked up format”; hence the use of “nearly”. The format is linear, but it can be displayed in built-up presentation form. To distinguish the two kinds of formats in this paper, we refer to the nearly plain-text format as the linear format and to the built-up presentation format as the built-up format. This linear format can be used with heuristics based on the Unicode math properties to recognize mathematical expressions without the aid of explicit math-on/off commands. The recognition is facilitated by Unicode’s strong support for mathematical symbols.² Alternatively, the linear format can be used in “math zones” explicitly controlled by the user either with on-off characters as used in TeX or with a character format attribute in a rich-text environment. Use of math zones is desirable, since the recognition heuristics are not infallible.

The linear format is more compact and easier to read than [La]TeX³,⁴ or MathML.⁵ However, unlike those formats, it doesn’t attempt to include all typographical embellishments. Instead we feel it’s useful to handle some embellishments in the higher-level layer that handles rich-text properties like text and background colors, font size, footnotes, comments, hyperlinks, etc. In principle one can extend the notation to include the properties of the higher-level layer, but at the cost of reduced readability. Hence embedded in a rich-text environment, the linear format can faithfully represent rich mathematical text, whereas embedded in a plain-text environment it lacks most rich-text properties and some mathematical typographical properties. The linear format is primarily concerned with presentation, but it has some semantic features that might seem to be only content oriented, e.g., n-aryands and function-apply arguments (see Secs. 3.4 and 3.5). These have been included to aid in displaying built-up functions with proper typography, but they also help to interoperate with math-oriented programs.

Most mathematical expressions can be represented unambiguously in the linear format, from which they can be exported to [La]TeX, MathML, C++, and symbolic manipulation programs. The linear format borrows notation from TeX for mathematical objects that don’t lend themselves well to a mathematical linear notation, e.g., for matrices.

A variety of syntax choices can be used for a linear format. The choices made in this paper favor a number of criteria: efficient input of mathematical formulae, sufficient generality to support high-quality mathematical typography, the ability to round-trip elegant mathematical text at least in a rich-text environment, and a format that resembles a real mathematical notation. Obviously, compromises between these goals had to be made.

The linear format is useful for 1) inputting mathematical expressions,⁶ 2) displaying mathematics by text engines that cannot display a built-up format, and 3) computer programs. For more general storage and interchange of math expressions between math-aware programs, MathML and other higher-level languages are preferred.

Section 2 motivates and illustrates the linear format for math using the fraction, subscripts, and superscripts along with a discussion of how the ASCII space U+0020 is used to build up one construct at a time. Section 3 summarizes the usage of the other constructs along with their relative precedences, which are used to simplify the notation. Section 4 discusses input methods. Section 5 gives ways to recognize mathematical expressions embedded in ordinary text. Section 6 explains how Unicode plain text can be helpful in programming languages. Section 7 gives conclusions. The appendices present a simplified linear-format grammar and a partial list of operators.

2. Encoding Simple Math Expressions

Given Unicode’s strong support for mathematics² relative to ASCII, how much better can a plain-text encoding of mathematical expressions look using Unicode? The best-known ASCII encoding of such expressions is that of TeX, so we use it for comparison. MathML is more verbose than TeX, and some of the comparisons apply to it as well. Notwithstanding TeX’s phenomenal success in the science and engineering communities, a casual glance at its representations of mathematical expressions reveals that they do not look very much like the expressions they represent. It’s not easy to make algebraic calculations by hand directly using TeX’s notation. With Unicode, one can represent mathematical expressions more readably, and the resulting nearly plain text can often be used with few or no modifications for such calculations. This capability is considerably enhanced by using the linear format in a system that can also display and edit the mathematics in built-up form.

The present section introduces the linear format with fractions, subscripts, and superscripts. It concludes with a subsection on how the ASCII space character U+0020 is used to build up one construct at a time. This is a key idea that makes the linear format ideal for inputting mathematical formulae. In general where syntax and semantic choices were made, input convenience was given high priority.

2.1 Fractions

One way to specify a fraction linearly is LaTeX’s \frac{numerator}{denominator}. The { } are not printed when the fraction is built up. These simple rules immediately give a “plain text” that is unambiguous, but looks quite different from the corresponding mathematical notation, thereby making it harder to read.

Instead we define a simple operand to consist of all consecutive letters and decimal digits, i.e., a span of alphanumeric characters, those belonging to the Lx and Nd General Categories (see The Unicode Standard 5.0,¹ Table 4-2. General Category). As such, a simple numerator or denominator is terminated by most nonalphanumeric characters, including, for example, arithmetic operators, the blank (U+0020), and Unicode characters in the ranges U+2200..U+23FF, U+2500..U+27FF, and U+2900..U+2AFF. The fraction operator is given by the usual solidus / (U+002F). So the simple built-up fraction
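The simple-operand rule described in this subsection can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's algorithm: the function names `is_operand_char` and `split_simple_fraction` are hypothetical, "Lx" is assumed here to mean any General Category beginning with "L", and the sketch handles only the first solidus in a string while ignoring parentheses, subscripts, and every other construct of the linear format.

```python
import unicodedata


def is_operand_char(ch: str) -> bool:
    """Return True for letters (assumed: any L* category) and decimal digits (Nd)."""
    category = unicodedata.category(ch)
    return category.startswith("L") or category == "Nd"


def split_simple_fraction(text: str):
    """Find the first solidus and return (numerator, denominator).

    Each part is the run of alphanumeric characters directly adjacent to
    the solidus; any other character (operator, blank, math symbol) ends
    the run. Returns None if there is no solidus or if either side is empty.
    """
    slash = text.find("/")  # U+002F, the fraction operator
    if slash < 0:
        return None

    # Extend the numerator leftward over alphanumeric characters.
    start = slash
    while start > 0 and is_operand_char(text[start - 1]):
        start -= 1

    # Extend the denominator rightward over alphanumeric characters.
    end = slash + 1
    while end < len(text) and is_operand_char(text[end]):
        end += 1

    numerator = text[start:slash]
    denominator = text[slash + 1:end]
    if not numerator or not denominator:
        return None
    return numerator, denominator


if __name__ == "__main__":
    print(split_simple_fraction("x+abc/d+y"))
```

In this sketch the `+` signs terminate both operands, so only `abc` and `d` take part in the fraction. A blank directly after the solidus leaves the denominator empty, and the function then reports no simple fraction.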
