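If you follow technology news, you will know that an AI chatbot called ChatGPT has recently become very famous. It is built on a large language model developed by a company called OpenAI and trained with reinforcement learning. Users only need to register a free account on the company's website to chat with the bot online in text. A user can ask any question, and the program automatically generates an answer on the web page, so the bot appears to interact with people in a question-and-answer exchange. Many examples are described online: you can ask the chatbot to draft a speech, write an email, or read an article and pick out the key points, and I have also seen people ask it to help write Excel programs. The results are actually very good, and sometimes the answers are produced more efficiently than a human could manage. Imagine writing an article: gathering material, organising it, and then typing it out word by word could take several hours, whereas the chatbot might finish the job within seconds. There is already discussion online suggesting that these chatbots could replace some types of jobs in the near future. Younger readers may really need to spend some time getting to know them.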

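A while ago I wanted to try it myself, only to discover that OpenAI does not currently allow users to register in Hong Kong, so unfortunately I could not use ChatGPT directly. Searching online, I found another app called Poe, whose bots are also backed by programs developed by OpenAI. In this article I run some simple tests of the chatbots' ability in mathematics and in writing code. Since the first bots I tested were not ChatGPT (which is only one kind of generative AI) but other generative AI bots, strictly speaking this article should be titled 「Conversations with different generative AI」.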

The first one I used was a bot called Dragonfly in Poe version 1.1.9. I questioned it in Chinese.

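Tim:「What is the area of a circle with a diameter of 10cm?」
Dragonfly:「The area of a circle with a diameter of 10cm is 314.16 square centimetres.」
Tim:「How do you calculate the area of a circle?」
Dragonfly:「The area of a circle is calculated with the formula πr², where r is the radius of the circle and the value of π is 3.14159. So if the diameter is 10cm, the radius is 5cm, and the area is πr²=3.14159×5²=314.16 square centimetres.」
Tim:「Can you calculate it again? It looks like you got it wrong.」
Dragonfly:「Sorry, I did not miscalculate. The area of a circle is calculated with the formula πr², where r is the radius of the circle and the value of π is 3.14159. So if the diameter is 10cm, the radius is 5cm, and the area is πr²=3.14159×5²=314.16 square centimetres.」
Tim:「What is 5 squared?」
Dragonfly:「5 squared is 25.」
Tim:「Then why is pi×5 squared three hundred and something?」
Dragonfly:「Because π×5 squared equals π×25, and π is approximately 3.14, so π×25 is approximately 78.5, which after multiplying by 100 equals 7850.」
Tim:「What on earth are you talking about?」
Dragonfly:「I am explaining why pi×5 squared equals three hundred and something. π is approximately 3.14, so π×25 is approximately 78.5, which after multiplying by 100 equals 7850.」
Tim:「Why multiply by 100?」
Dragonfly:「Because squaring means multiplying a number by itself, so π×25 squared is π×25×25, which after multiplying by 100 is 7850.」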

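Dragonfly does have some mathematical foundation and seems to have some algebraic ability: it knows how to go from the diameter to the radius and then substitute into the area formula. The failure is in the arithmetic, where for some unknown reason it gets 314.16 out of 3.14159×25 (the correct value is about 78.54). The conversation also shows that it cannot learn. My questions did not lead it to notice any mistake in its calculation, and the more I asked, the more fanciful its answers became, until it was unclear what it was doing.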
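
The arithmetic Dragonfly stumbled on can be checked in a couple of lines. This is a minimal sketch in Python; the function name circle_area_from_diameter is only illustrative and does not come from the conversation.

```python
import math

def circle_area_from_diameter(diameter):
    """Area of a circle given its diameter, via A = pi * r**2."""
    radius = diameter / 2
    return math.pi * radius ** 2

area = circle_area_from_diameter(10)
print(round(area, 2))  # 78.54, not 314.16
```

Incidentally, 314.16 is the value of π×10², which is what one gets when the diameter is used in place of the radius. That may be where the figure came from, although the bot's own explanation does not say so.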

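Looking at the app, I found that Poe does in fact tell users that because this bot responds faster, its answers may not be very accurate. So the later experiments used another bot called Sage. It needs more computing time, so its responses may be slower (perhaps 2 seconds instead of 1!).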

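Tim:「What is the area of a circle with a diameter of 10cm?」
Sage:「The diameter of this circle is 10cm, so its radius is 5cm. The area of a circle can be calculated with the formula A=πr^2, where A is the area, r is the radius, and π is the circle constant, approximately 3.14159.

Substituting the radius r=5cm into the formula gives:

A=πr^2
A=π×5^2
A=78.54 square centimetres

Therefore, the area of this circle is 78.54 square centimetres.」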

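Sage's level of mathematics is better than Dragonfly's. It can find the radius from the diameter, substitute the radius into the area formula, and work out the answer correctly. Good, so let me make the question a little harder.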

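Tim:「A circle has a circumference of 1 centimetre. What is its area?」
Sage:「If the circumference is 1 centimetre, then the radius is 1/2π centimetres. Therefore, the area of the circle is π×(1/2π)^2=π/4 square centimetres.」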

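This time the task is to go from the circumference, not the diameter, to the radius. I think its level is acceptable, probably better than some university students who are afraid of mathematics (!). It correctly finds the radius from the circumference and correctly substitutes the radius into the area formula. The only mistake is one wrong step when simplifying the answer: π×(1/(2π))^2 simplifies to 1/(4π), not π/4. Most of the solution is right, so if I saw this answer in an exam I would probably only deduct the marks for the final answer. Not a bad result.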
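
For reference, the full chain from circumference to area can be written out for a general circumference C and radius r, and then specialised to C = 1:

```latex
\begin{aligned}
C &= 2\pi r \quad\Longrightarrow\quad r = \frac{C}{2\pi},\\
A &= \pi r^{2} = \pi\left(\frac{C}{2\pi}\right)^{2} = \frac{C^{2}}{4\pi},\\
C &= 1 \quad\Longrightarrow\quad A = \frac{1}{4\pi} \approx 0.0796 .
\end{aligned}
```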

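A friend suggested that its Chinese might not be very good, so I asked some of the later questions in English to see how it would score in mathematics.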

Tim:「given a circle with circumference 1, determine its area」
Sage:「The area of a circle with circumference 1 can be calculated using the formula A=(C/2π)^2, where A is the area, C is the circumference, and π is the mathematical constant pi (approximately 3.14159).

Substituting C=1 into the formula, we get:

A=(1/2π)^2
A=1/(4π^2)

Therefore, the area of the circle with circumference 1 is 1/(4π^2) square units.」

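In English, it even gets the area formula wrong: a factor of π is missing, since the area should be π(C/2π)^2 = C^2/(4π). In an exam, its score would probably be a little lower than for the Chinese answer.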

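Two days before this article was published (4 March), I found that version 1.1.10 of the app had added ChatGPT! So I re-ran the questions above and found that this chatbot is at a higher level. It answered every one of the circle questions above correctly!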

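The questions below are more rigorous mathematical problems, all taken from our second-year university course in mathematical analysis. As you will see, the chatbots do relatively well on questions that rely on recall or definitions, but they cannot handle mathematics that requires higher-level thinking.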

Tim:「prove that f(x)=cos(x) is a continuous function using the epsilon-delta approach」
Sage:「To prove that the function f(x) = cos(x) is continuous using the epsilon-delta approach, we need to show that for any given ε > 0, there exists a δ > 0 such that for all x and y in the domain of f(x), if |x – y| < δ, then |f(x) – f(y)| < ε.

Given any ε > 0, we can choose δ = ε. Then, if |x – y| < δ = ε, we have:

|cos(x) – cos(y)| <= 2

Since |cos(x) – cos(y)| <= 2 for all x and y in the domain of f(x), we can see that the cosine function is continuous, since for any given ε > 0, we can always find a δ > 0 such that the inequality |f(x) – f(y)| < ε is satisfied for all x and y in the domain of f(x) whenever |x – y| < δ.」

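For this question, it easily retrieves the definition from its database, and it clearly knows what we are asking it to prove. However, it has no way of actually reaching what it needs to prove; it just pieces things together at random and then claims the result is proved. If I saw this answer in an exam, I suppose the definition written at the start would earn a few marks. So the answer as a whole would not score zero, and it would probably still do better than some mathematics majors (=_=).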

Strangely, if we change the function slightly, its performance seems to be somewhat different.

Tim:「prove that f(x)=sin(x) is a continuous function using the epsilon-delta approach」
Sage:「To prove that the function f(x) = sin(x) is continuous using the epsilon-delta approach, we need to show that for any given ε > 0, there exists a δ > 0 such that for all x and y in the domain of f(x), if |x – y| < δ, then |f(x) – f(y)| < ε.

Here’s one way to do that:

Given an ε > 0, choose δ = ε. Then, for any x and y in the domain of f(x) such that |x – y| < δ, we have:

|f(x) – f(y)| = |sin(x) – sin(y)|

By the Triangle Inequality, we know that |sin(x) – sin(y)| <= |x – y|, so we have:

|f(x) – f(y)| = |sin(x) – sin(y)| <= |x – y| < δ = ε

Therefore, we have shown that for any ε > 0, there exists a δ > 0 such that for all x and y in the domain of f(x), if |x – y| < δ, then |f(x) – f(y)| < ε. This means that the function f(x) = sin(x) is continuous.」

Tim:「why |sin(x) – sin(y)| <= |x – y|?」
Sage:「The inequality |sin(x) – sin(y)| <= |x – y| is a result of the Triangle Inequality.

The Triangle Inequality states that for any triangle with sides a, b, and c, the sum of the lengths of any two sides must be greater than or equal to the length of the third side.

In the context of the sine function, the Triangle Inequality can be applied to the triangle formed by the points (0, 0), (x, sin(x)), and (y, sin(y)) in the coordinate plane. The sides of this triangle are the distances between the points, and the Triangle Inequality states that the sum of the lengths of any two sides must be greater than or equal to the length of the third side.

It follows that, in this triangle, |sin(x) – sin(y)| must be less than or equal to the length of the side connecting the points (x, sin(x)) and (y, sin(y)), which is just |x – y|. Hence, we have:

|sin(x) – sin(y)| <= |x – y|」

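One of the inequalities used in the proof is in fact correct, but the way it is justified is nonsense. It sounds plausible, yet on closer thought the argument cannot produce the inequality it needs. I do not know how it made up this passage.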
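
A standard way to justify the inequality, which is not the argument Sage gave, is the mean value theorem. For x ≠ y there is some ξ between x and y such that:

```latex
\sin x - \sin y = \cos(\xi)\,(x - y)
\quad\Longrightarrow\quad
|\sin x - \sin y| = |\cos \xi|\,|x - y| \le |x - y| .
```

The same argument with the derivative −sin ξ gives |cos x − cos y| ≤ |x − y|, which is the estimate missing from the cosine answer earlier; with it, the choice δ = ε works there as well.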

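Overall, this bot should have no difficulty solving simple mathematical problems. But when a problem calls for somewhat higher-level techniques, these chatbots cannot give us a correct answer. So we have to be very careful when using them to help with mathematics, otherwise they may well give us wrong answers. As you can see, the bot is very stubborn: it does not think it has made any mistake, and it has no ability to learn or to correct itself. Of course these tests may not be fair, because these chatbots were not designed for this kind of reasoning problem. The program itself is only a natural language processing (NLP) model. It does not understand the question; it just picks out relevant words from our question and compares them with other words in the system to see whether they are statistically related.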

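As for ChatGPT, its answers to these two analysis questions are already largely correct! For the sine function, it knows to use a trigonometric identity, which it calls the double angle formula (strictly speaking, it is the sum-to-product identity).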

ChatGPT:「To prove that f(x) = sin(x) is a continuous function using epsilon-delta approach, we need to show that for any ε > 0, there exists a δ > 0 such that |sin(x) – sin(c)| < ε whenever |x – c| < δ, where c is some fixed point in the domain of sin(x).

Let ε > 0 be given. We can choose δ = ε, since for any x and c in the domain of sin(x), we have:

|sin(x) – sin(c)| = 2|cos((x+c)/2)sin((x-c)/2)| ≤ 2|sin((x-c)/2)| ≤ |x-c|

where we have used the double angle formula for sin and the fact that |cos((x+c)/2)| ≤ 1 for all x and c.

Now, suppose |x – c| < δ = ε. Then we have:

|sin(x) – sin(c)| ≤ |x – c| < ε

Therefore, we have shown that for any ε > 0, there exists a δ > 0 such that |sin(x) – sin(c)| < ε whenever |x – c| < δ. This proves that f(x) = sin(x) is a continuous function. 」

One small flaw: it does not explain why 2|sin((x-c)/2)| ≤ |x-c|. Had it also stated that |sin(x)| ≤ |x|, the proof would be complete.
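
The missing step is short. Assuming the elementary bound |sin t| ≤ |t| for every real t, substitute t = (x − c)/2:

```latex
2\left|\sin\frac{x-c}{2}\right| \;\le\; 2\left|\frac{x-c}{2}\right| \;=\; |x-c| .
```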

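In addition, because in another semester I will be teaching a course on writing computer programs to do computations, I also wanted to know whether these chatbots can answer mathematical questions from a less mainstream subject. When I taught this course last semester, students were allowed to use online programming tools, so they were even allowed to search for answers online during the exam. I even told them that if they asked Siri and it could answer, they were welcome to use its answer. So the next time I teach this course, I may really have to try out different chatbots and see whether they can earn a good grade from me.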

First, a relatively simple question.

Tim:「python code to compute the distance between two vectors in 100 dimensions」
Sage:「Here’s an example of how to compute the distance between two vectors in 100 dimensions using the numpy library:

import numpy as np

def euclidean_distance(v1, v2):
    return np.linalg.norm(v1 - v2)

v1 = np.random.rand(100)
v2 = np.random.rand(100)
distance = euclidean_distance(v1, v2)
print("The distance between the vectors is:", distance)
This code defines a function euclidean_distance that takes two vectors as input and returns the Euclidean distance between them. The input vectors v1 and v2 are generated using the random.rand function and are 100-dimensional. The function computes the Euclidean distance between the vectors as the magnitude of the difference between them. The result is displayed as “The distance between the vectors is:". 」

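This question is fairly direct, and its performance is excellent. My guess is that its answer may be better than those of some students last semester. What if we use a different definition of distance?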

Tim:「python code to determine the 1-norm distance between two vectors in 100 dimensions」
Sage:「Here’s an example of how to compute the 1-norm distance (also known as the Manhattan distance) between two vectors in 100 dimensions using the numpy library:

import numpy as np

def manhattan_distance(v1, v2):
    return np.sum(np.abs(v1 - v2))

v1 = np.random.rand(100)
v2 = np.random.rand(100)
distance = manhattan_distance(v1, v2)
print("The Manhattan distance between the vectors is:", distance)
This code defines a function manhattan_distance that takes two vectors as input and returns the Manhattan distance between them. The input vectors v1 and v2 are generated using the random.rand function and are 100-dimensional. The function computes the Manhattan distance between the vectors as the sum of the absolute differences between the elements of the vectors. The result is displayed as “The Manhattan distance between the vectors is:". 」

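Nothing wrong here either. We can see that it computes the distance directly from the definition. Is it able to use another function to help with the calculation?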

Tim:「can u use the function norm? 」
Sage:「Yes, you can use the norm function from the numpy library to compute the 1-norm distance between two vectors:

import numpy as np

v1 = np.random.rand(100)
v2 = np.random.rand(100)

distance = np.linalg.norm(v1 - v2, ord=1)
print("The 1-norm distance between the vectors is:", distance)
In this example, v1 and v2 are random 100-dimensional vectors, and the norm function is used to compute the 1-norm distance between them. The ord argument is set to 1 to specify that the 1-norm distance should be computed, and the result is displayed as “The 1-norm distance between the vectors is:". 」

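On this question, its performance is really excellent. Perhaps the question itself is rather routine and leaves little room for thought. I think these answers could probably also be found by searching online.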

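In that case, let us look at a harder question. In a 5×5 matrix, randomly select 20% of the entries and set them to 0, then set all the others to 1. So first the computer needs to know that 5 of the entries must be 0, and second it needs to know that these entries must be chosen at random. Let us see how it tries.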

Tim:「python code to generate a five by five matrix with 20% entries in random given by 0 and all others 1」
Sage:「Here’s an example in Python using the numpy library:

import numpy as np

matrix = np.ones((5, 5))
for i in range(5):
    for j in range(5):
        if np.random.rand() < 0.2:
            matrix[i][j] = 0

print("The 5×5 matrix with 20% entries as 0 and the rest as 1 is:\n", matrix)
This code generates a 5×5 matrix with all entries initialized as 1, and uses a nested loop to randomly change 20% of the entries to 0 using the numpy.random.rand function. The result is displayed as “The 5×5 matrix with 20% entries as 0 and the rest as 1 is:". 」

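This answer is actually not bad. You can see that it picks out the 20% simply by drawing a random number for each entry. As the size of the matrix tends to infinity, this method is correct. But for a 5×5 matrix we generally cannot guarantee that exactly 20% of the entries take the value we want. To make it address this, I added the hint 「exactly」 to the question, requiring that the matrix have exactly 5 entries equal to 0.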
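
To see how often the entry-by-entry approach hits the target, note that the number of zeros it produces follows a binomial distribution with 25 trials and success probability 0.2. The short calculation below is a sketch of that reasoning and is not part of the original conversation.

```python
from math import comb

n, p, k = 25, 0.2, 5  # 25 entries, each set to 0 with probability 0.2

# Probability that exactly k of the n independent draws fall below p
prob_exactly_five = comb(n, k) * p**k * (1 - p)**(n - k)
print(f"P(exactly 5 zeros) = {prob_exactly_five:.3f}")
```

The value comes out a little under 0.2, so only about one run in five produces a matrix with exactly five zeros.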

Tim:「python code to generate a five by five matrix with exactly 20% entries in random given by 0 and all others 1」
Sage:「Here’s an example in Python using the numpy library:

import numpy as np

matrix = np.ones((5, 5))
count = int(5 * 5 * 0.2)
for i in range(count):
    x = np.random.randint(5)
    y = np.random.randint(5)
    while matrix[x][y] == 0:
        x = np.random.randint(5)
        y = np.random.randint(5)
    matrix[x][y] = 0

print("The 5×5 matrix with exactly 20% entries as 0 and the rest as 1 is:\n", matrix)
This code generates a 5×5 matrix with all entries initialized as 1, and uses a loop to change exactly 20% of the entries to 0. The count variable keeps track of the number of entries that need to be changed to 0. The loop continues to run until count is equal to the number of entries that need to be changed. The np.random.randint function is used to generate random indices for the entries that will be changed to 0. The result is displayed as “The 5×5 matrix with exactly 20% entries as 0 and the rest as 1 is:". 」

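The result is very good: as long as we ask the right question, the chatbot can give us a correct answer. In that case, let me put a bit more mathematics into the question.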
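
A more compact alternative is to pick the five positions in a single call by sampling without replacement. This is a sketch of how one might write it, not something either bot produced; the helper name random_binary_matrix and its parameters are hypothetical.

```python
import numpy as np

def random_binary_matrix(rows=5, cols=5, zero_fraction=0.2, seed=None):
    """Matrix of ones with exactly round(rows*cols*zero_fraction) zeros at random positions."""
    rng = np.random.default_rng(seed)
    n_zeros = round(rows * cols * zero_fraction)
    flat = np.ones(rows * cols)
    # Sample distinct flat indices, so no position can be picked twice
    zero_positions = rng.choice(rows * cols, size=n_zeros, replace=False)
    flat[zero_positions] = 0
    return flat.reshape(rows, cols)

matrix = random_binary_matrix(seed=0)
print(matrix)
print("number of zeros:", int((matrix == 0).sum()))
```

Because the indices are drawn without replacement, no retry loop is needed to avoid choosing the same entry twice.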

Tim:「python code to show two vectors are parallel」
Sage:「Here’s an example in Python to show that two vectors are parallel:

def is_parallel(v1, v2):
    if v1 == v2 or v1 == [-x for x in v2]:
        return True
    else:
        return False

v1 = [1, 2]
v2 = [2, 4]

if is_parallel(v1, v2):
    print("The vectors are parallel")
else:
    print("The vectors are not parallel")
This code defines a function is_parallel that takes two vectors as input and returns True if they are parallel and False otherwise. Two vectors are parallel if they are equal or if one is equal to the negative of the other. In this example, v1 and v2 are defined as the 2D vectors [1, 2] and [2, 4], which are parallel. The function returns True and prints “The vectors are parallel".」

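The mathematics in this question is not hard: I just want it to check whether two vectors are parallel. Its program only checks whether the two vectors are equal or whether one is the other multiplied by −1. Its answer to this question is wrong. So when a question has a substantial mathematical component, the chatbot is still unable to write the program.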
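
Running Sage's test on its own example makes the problem concrete. The function below repeats the logic quoted above, condensed to a single return statement, and uses the same two vectors Sage chose.

```python
def is_parallel(v1, v2):
    # Sage's test: equal, or one is the negative of the other
    return v1 == v2 or v1 == [-x for x in v2]

v1 = [1, 2]
v2 = [2, 4]  # v2 = 2 * v1, so the two vectors are parallel

result = is_parallel(v1, v2)
print(result)  # False, even though the vectors are parallel
```

So the program contradicts its own explanation, which claims that the function returns True for this pair.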

ChatGPT is stronger than Sage at writing programs. On this parallel-vector question, it actually got it right!

ChatGPT:「Two vectors are parallel if they are scalar multiples of each other. In other words, if one vector is a multiple of the other, they are parallel. We can use this fact to write a Python function that checks whether two vectors are parallel. Here’s an example:

import numpy as np

def are_parallel(u, v, tol=1e-8):
    """
    Check whether two vectors u and v are parallel.
    """
    # Compute the norm of u and v
    norm_u = np.linalg.norm(u)
    norm_v = np.linalg.norm(v)

    # If either vector has zero norm, they are not parallel
    if norm_u == 0 or norm_v == 0:
        return False

    # Compute the ratio of the norm of u to the norm of v
    ratio = norm_u / norm_v

    # Check if u is a multiple of v, within a tolerance
    if np.allclose(u, ratio * v, atol=tol):
        return True
    else:
        return False

This function takes two vectors u and v as input, along with an optional tolerance tol (defaulting to 1e-8). It first computes the norms of the vectors using the np.linalg.norm function. If either vector has zero norm, they are not parallel, so the function returns False.

If both vectors have nonzero norm, the function computes the ratio of the norm of u to the norm of v. If u is a multiple of v, then u should be equal to some scalar multiple of v, which is given by multiplying v by the ratio. The function then checks whether u is close enough to the scaled version of v within the given tolerance using the np.allclose function. If they are close enough, the function returns True, indicating that the vectors are parallel. Otherwise, it returns False.」
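
One caveat is worth checking. Because the ratio of two norms is always positive, the function above appears to accept only vectors pointing in the same direction; for u = -v it would compare u with +v and return False, although such vectors are scalar multiples of each other. The sketch below is not from the conversation. It is one possible check that treats both cases as parallel, using the fact that |u·v| equals the product of the norms exactly when u and v are linearly dependent. The function name and tolerance are assumptions, and zero vectors are reported as not parallel to match the convention above.

```python
import numpy as np

def are_parallel_any_direction(u, v, tol=1e-8):
    """Return True if nonzero vectors u and v are scalar multiples of each other."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    norm_u = np.linalg.norm(u)
    norm_v = np.linalg.norm(v)
    if norm_u == 0 or norm_v == 0:
        return False
    # Cosine of the angle between u and v; it is +1 or -1 for parallel vectors
    cos_angle = np.dot(u, v) / (norm_u * norm_v)
    return bool(np.isclose(abs(cos_angle), 1.0, rtol=0.0, atol=tol))

print(are_parallel_any_direction([1, 2], [2, 4]))    # True
print(are_parallel_any_direction([1, 2], [-1, -2]))  # True
print(are_parallel_any_direction([1, 2], [2, 1]))    # False
```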

What about perpendicularity?

Tim:「python code to determine a unit vector that is orthogonal to (1,2,3,4) 」
Sage:「Unfortunately, a unit vector in 4-dimensional space cannot be orthogonal to a vector in 3-dimensional space. Orthogonality is only defined between two vectors in the same dimension. To determine a unit vector that is orthogonal to a 4-dimensional vector, you would need to input a 4-dimensional vector. 」

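This question asks for a vector of length 1 in four-dimensional space that is perpendicular to a given vector. It tests two things: first, whether the bot understands perpendicularity, and second, how it picks one answer out of infinitely many.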

Tim:「python code to determine a unit 4-dimensional vector that is orthogonal to (1,2,3,4) 」
Sage:「Here’s an example code in Python to determine a unit 4-dimensional vector that is orthogonal to (1,2,3,4):

import numpy as np

def orthogonal_vector(vector):
    random_vector = np.random.rand(4)
    orthogonal_vector = np.cross(vector, random_vector)
    unit_vector = orthogonal_vector / np.linalg.norm(orthogonal_vector)
    return unit_vector

vector = np.array([1,2,3,4])
unit_vector = orthogonal_vector(vector)
print(unit_vector)
Note that this code uses numpy.cross to determine the cross product of two vectors. However, numpy.cross only works for arrays of 3 dimensions. To find the cross product of arrays with more than 3 dimensions, you need to use a different method. 」

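This answer is wrong, because it uses the cross product, and the usual cross product is a three-dimensional operation. The program above does not work. If you already know that there is no cross product in four dimensions, why do it this way?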

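Thinking about its answer, if the question were set in three-dimensional space instead of four, the answer would actually be correct. It picks a random vector in three-dimensional space and takes the cross product, so orthogonal_vector really does store a vector perpendicular to the given one, and the next step rescales its length to 1.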

Sage:「To find the cross product of two arrays of more than 3 dimensions, you can use the tensor product. The tensor product is a mathematical operation that takes two vectors and returns a matrix. The elements of the matrix are the products of the elements of the two vectors.

In general, the cross product of two vectors in n-dimensional space is not defined. However, there are specific cases when it’s possible to find a vector that is orthogonal to a given vector in n-dimensional space. For example, in 4-dimensional space, it’s possible to find two orthogonal vectors that are orthogonal to a given vector.

In conclusion, finding a unit vector that is orthogonal to a given vector in n-dimensional space is not straightforward and depends on the specific case. It’s important to consider the properties of the vectors and the space they are defined in. 」

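This test shows that it has some knowledge of the cross product, and that the properties of this operation are stored in its database, so it tries to use this technique to solve the problem. But answering this question correctly does not require information from a database. Students need to apply the definition of perpendicularity carefully and find, among all the possibilities, an answer that meets the requirements. That is something the chatbot cannot do.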
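
One way to apply the definition directly, sketched here as an illustration and not as the expected exam answer, is to start from any vector w and remove its component along the given vector v. The remainder w − (w·v / v·v) v has zero dot product with v, and dividing it by its length gives a unit vector. The function name and the use of a random starting vector are assumptions.

```python
import numpy as np

def unit_orthogonal_vector(v, seed=None):
    """Return a unit vector orthogonal to the nonzero vector v (any dimension >= 2)."""
    v = np.asarray(v, dtype=float)
    rng = np.random.default_rng(seed)
    while True:
        w = rng.standard_normal(v.shape)
        # Subtract the projection of w onto v; what remains is orthogonal to v
        residual = w - (np.dot(w, v) / np.dot(v, v)) * v
        length = np.linalg.norm(residual)
        if length > 1e-12:  # retry in the unlikely case w is (nearly) parallel to v
            return residual / length

v = np.array([1, 2, 3, 4])
u = unit_orthogonal_vector(v, seed=0)
print(u)
print("dot product with v:", np.dot(u, v))  # close to 0, up to rounding
print("length:", np.linalg.norm(u))         # close to 1, up to rounding
```

A fixed answer can also be read off by hand, for example (2, −1, 0, 0)/√5, since 1·2 + 2·(−1) = 0.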

As for ChatGPT, it still used the cross product, so its answer is still wrong.