[spec] The Tragedy of &


From: Gary Johnson

Date: Wed, 27 Jan 2021 17:19:59 -0500


Jason McBrayer wrote:


> Having more complex forms is a temptation to implement applications on

> Gemini, rather than using pairings of protocol+client that are more

> appropriate (e.g. using NNTP for a message board).


Charlie Stanton <charlie at shtanton.com> wrote:


> I agree with this completely. I think Gemini should be a protocol for

> viewing content only. I missed all the discussion around inimeg, titan

> etc. at the time but I feel similarly about those.

>

> I think a different protocol for filling out forms makes a lot more

> sense, and we can work on having gemini clients and form clients play

> nicely together so the user experience doesn't suffer from using a

> different program to fill out a form.

>

> Adding forms would take us wayyyyy too close to the web in my opinion.


And now me...


tl;dr: Gemini can already emulate forms. We just need a spec language

clarification in Section 3.2.1 1x (INPUT) from Solderpunk and for

client authors to update their software accordingly. I illustrate

both points (and provide code) below.



Section 1: Motivation



I appreciate the generally conservative nature of the Gemini community

when it comes to extending the Gemini and Gemtext specifications. As a

server author, this certainly keeps my life easier.


However, I'd like to go on record here to say that interactive capsules

are not something that worries me. There are already quite a few of them

out there in Geminispace (hello Astrobotany!), and I'd like to continue

to see this medium grow and thrive in our little corner of the internet.


I don't think form-like data submission should be seen as an evil. It

allows us to implement a wide variety of CGI-style applications that do

all their computing on the server side (often through some script

extension mechanism). This keeps our servers and clients simple,

empowers content authors to build cool things, and still keeps us nicely

insulated from "The Javascript Trap" since our Gemini clients never

download and run any client-side code.


The Javascript Trap



Section 2: The Problem



Over the months that I have followed this mailing list, I've seen

broadly two categories of proposals around extending Gemini's simple

input methods:


1. Ways to submit multiple pieces of information to a server at once.


2. Ways to upload files to a server.


Both proposals are pretty self-explanatory since they extend the

possible functionality of interactive Gemini capsules without breaking

any of our privacy or security guarantees. However, option 1 puts an

additional burden on client authors, and option 2 puts an additional

burden on both client and server authors.


Some members of our community have suggested that these features aren't

worth the extra effort. Others have argued in favor of one or both of

them, and a brave few have gone off and created their own sister

protocols to try and implement Gemini-like systems that also support

some variant of these two data upload options (e.g., Titan, Dioscuri,

Inimeg).


From a personal standpoint (and I can only speak for myself here

obviously), I wouldn't mind one or more form types being added to

Gemtext (option 1 above) as it would reduce the total number of

round-trip network requests between client and server to submit multiple

pieces of information (and I have quite a slow satellite internet

connection, so this matters to me).



Section 3: A Solution



However, even without (a very unlikely) form enhancement to Solderpunk's

Gemtext spec, I'd like to remind folks that we actually do (or at least

we should) already have the ability to emulate forms in our Gemini

capsules.



Section 3.1: Form Templates



Assuming we are currently browsing a page at

gemini://awesome.capsule.net/form, this dynamic Gemtext page could

include forms as follows:


# Welcome to my Gemini Form!

To fill in any field below, simply click it. Everything's a link in Gemini, so you can't really mess up!

=> form?$SESSION&name Name: $NAME

=> form?$SESSION&password Password: $PASSWORD

=> form?$SESSION&smog SMOG is great: $SMOG

=> form?$SESSION&plant Best Astrobotany Plant: $PLANT

=> form?$SESSION&submit Submit Answers

Here, my Gemtext is a template string, which I process in a context in

which $SESSION, $NAME, $PASSWORD, $SMOG, and $PLANT are defined (or

default to empty strings). When the page first loads, we create a new

$SESSION value in our CGI script and insert it into the links to

preserve state across requests until we restart the server or the user

refreshes the page.


(Obviously, a more robust state management mechanism could be achieved

with client certs and a DB, but I just mean to show a very simple

example here.)
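
The templating step described above could be sketched roughly as follows. This is a Python illustration only, not the attached form.clj; the names FORM_TEMPLATE, new_session_id, and render_form are hypothetical, and the session token format is an assumption.

```python
import secrets
from string import Template

# Hypothetical template mirroring the Gemtext form above.
FORM_TEMPLATE = Template(
    "# Welcome to my Gemini Form!\n"
    "\n"
    "=> form?$SESSION&name Name: $NAME\n"
    "=> form?$SESSION&password Password: $PASSWORD\n"
    "=> form?$SESSION&smog SMOG is great: $SMOG\n"
    "=> form?$SESSION&plant Best Astrobotany Plant: $PLANT\n"
    "=> form?$SESSION&submit Submit Answers\n"
)

FIELDS = ("NAME", "PASSWORD", "SMOG", "PLANT")


def new_session_id():
    """Return a fresh URL-safe session token (assumed format: 16 hex chars)."""
    return secrets.token_hex(8)


def render_form(session, values=None):
    """Fill in the template; fields with no stored value default to ''."""
    values = values or {}
    context = {field: values.get(field, "") for field in FIELDS}
    context["SESSION"] = session
    return FORM_TEMPLATE.substitute(context)
```

On the first page load the script would call render_form(new_session_id()); on later requests it would pass the values already stored for that session.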



Section 3.2: Server-side Responses



Here would be the server-side responses for each of those links:


NAME: 10 Enter your name\r\n

PASSWORD: 11 Enter password\r\n

SMOG: 10 Choose one of [Yes|No]\r\n

PLANT: 10 Choose one of [Ficus|Baobab|Pachypodium|Moss]\r\n


For the boolean choice (SMOG) and the multiple choice (PLANT) inputs,

you could, of course, perform input validation and re-prompt if

necessary. You could also simply include one link per choice in your

form template instead of using a 10 INPUT response.
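
As a rough Python sketch of these responses and of the optional validation step (the PROMPTS and CHOICES tables and both function names are hypothetical; only three of the four fields are shown, and the plant field would work the same way as smog):

```python
# Status code and prompt per field, taken from the examples above.
PROMPTS = {
    "name": (10, "Enter your name"),
    "password": (11, "Enter password"),  # 11 = sensitive input
    "smog": (10, "Choose one of [Yes|No]"),
}

# Allowed answers for choice fields, compared case-insensitively.
CHOICES = {
    "smog": {"yes", "no"},
}


def input_response(field):
    """Return the Gemini response header that prompts for `field`."""
    status, prompt = PROMPTS[field]
    return f"{status} {prompt}\r\n"


def validate(field, value):
    """Return True if `value` is acceptable for `field`.

    Free-text fields accept anything; choice fields must match an option.
    """
    allowed = CHOICES.get(field)
    return allowed is None or value.strip().lower() in allowed
```

If validate returns False, the script could simply send input_response(field) again to re-prompt.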



Section 3.3: (DESIRED) Client-side Requests



The intention of this example is that the clients would produce requests

of this form after each input prompt:


gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson

gemini://awesome.capsule.net/form?$SESSION&password&secret

gemini://awesome.capsule.net/form?$SESSION&smog&yes

gemini://awesome.capsule.net/form?$SESSION&plant&Ficus


where $SESSION is whatever value was generated by the CGI script on the

first page load.



Section 3.4: Server-side State Management and Form Submission



With this information in the query params, it would be easy to store a

lookup table in the CGI script that mapped session -> field -> value,

and these values can then be easily inserted into the original Gemtext

template form above (see Section 3.1) in response to these requests.


The form?$SESSION&submit link can then trigger the server to validate

that all of the required form fields have been filled in correctly and

perform whatever next step operation you want.
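
A minimal in-memory sketch of that lookup table, in Python. It assumes query strings of the shape SESSION&field&value shown in Section 3.3; the function name and the returned tags ("show-form", "prompt", and so on) are hypothetical placeholders for whatever the real script would do next.

```python
from urllib.parse import unquote

# session -> field -> value, kept in memory for this sketch only.
SESSIONS = {}

REQUIRED_FIELDS = ("name", "password", "smog", "plant")


def handle_query(query_string):
    """Process a 'SESSION', 'SESSION&field' or 'SESSION&field&value' query.

    Returns a short tag naming the next step the server should take.
    """
    parts = query_string.split("&")
    session = parts[0]
    fields = SESSIONS.setdefault(session, {})
    if len(parts) == 1:
        return "show-form"
    field = parts[1]
    if field == "submit":
        missing = [f for f in REQUIRED_FIELDS if not fields.get(f)]
        return "incomplete" if missing else "submitted"
    if len(parts) == 2:
        return "prompt"  # no value yet: answer with a 1x INPUT response
    # Rejoin in case the client left a raw '&' inside the value.
    fields[field] = unquote("&".join(parts[2:]))
    return "show-form"
```

After a "show-form" result the script would re-render the template from Section 3.1 with the values stored in SESSIONS[session].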



Section 3.5: File "Uploads"



In addition, as I mentioned several months ago on this list, you could

perform file "uploads" by having one of the input links prompt for a URL

to a file. Then the server could download that file and store it in your

session (or account if you're using client certs and a DB).



Section 4: What's Stopping This from Working?



While this example creates more back-and-forth requests than a proper

client-side form would generate, I hope it demonstrates that Gemini and

Gemtext in their current incarnations are already sufficiently complete

to build interactive CGI applications with them today.


The only problem I'm running into here is that the various Gemini

clients I've tested (elpher, bombadillo, kristall) don't actually append

a user's input as an additional parameter to an existing query string if

one is present. Instead, bombadillo and kristall just overwrite the

existing query string and only return ?$NEW_INPUT. Elpher, on the other

hand, just creates invalid URLs by simply appending ?$NEW_INPUT to

whatever is already in the URL (e.g.,

gemini://awesome.capsule.net/form?$SESSION&smog?yes). Neither of these

behaviors does what I'd want or expect here.



Section 4.1: Check the Spec!



I think the culprit then is probably Gemini Protocol Specification

section 3.2.1 1x (INPUT):


Status codes beginning with 1 are INPUT status codes, meaning:

The requested resource accepts a line of textual user input. The <META>
line is a prompt which should be displayed to the user. The same
resource should then be requested again with the user's input included
as a query component. Queries are included in requests as per the usual
generic URL definition in RFC3986, i.e. separated from the path by a ?.
Reserved characters used in the user's input must be "percent-encoded"
as per RFC3986, and space characters should also be percent-encoded.


Section 4.2: Append Don't Replace!



As far as I can tell, the fix here is for Solderpunk to update the text

in section 3.2.1 to indicate that if a query string is already part of

the request leading to an INPUT response, then the user's input should

be appended (using &) to the existing query string rather than replacing

it wholesale (using ?).


Otherwise, we really have no way to input more than one query param

(with &) other than asking the user to type it directly into the INPUT

prompt (e.g., cat&dog&pig). I'm hoping this isn't the spec's intention

here and that we just have a case of ambiguous wording that has led some

client authors to create divergent (or broken) implementations.
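
To make the difference concrete, here is a small Python sketch of the two client behaviors: the replace behavior reported above for bombadillo and kristall, and the append behavior being proposed. Both function names are hypothetical, and the sketch ignores URL fragments for brevity.

```python
from urllib.parse import quote


def replace_input(url, user_input):
    """Overwrite any existing query with the user's input."""
    base = url.split("?", 1)[0]
    return base + "?" + quote(user_input, safe="")


def append_input(url, user_input):
    """Keep an existing query and append the user's input with '&'."""
    encoded = quote(user_input, safe="")
    base, has_query, query = url.partition("?")
    if has_query and query:
        return f"{base}?{query}&{encoded}"
    return f"{base}?{encoded}"
```

With the URL gemini://awesome.capsule.net/form?abc123&name and the input "Gary Johnson", replace_input drops the session and field name, while append_input yields the request from Section 3.3.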



Section 5: Conclusion and a Call to Action



Okay, that was a LONG message, but I hope I've communicated my points

clearly. Thanks to all who read this far, and thanks to everyone for

making Gemini such an active and engaging community!


I've attached a short (47 line) CGI script (for Space Age) that

implements the dynamic form example described in this email. If clients

would append user input params (with &) to existing query strings rather

than replace them, it should work perfectly. Until then, it will just

have to feel a bit sad and dejected.


Whose client is going to make it work first? I wait eagerly with bated

breath to find out.


Happy hacking!

Gary


-------------- next part --------------

An embedded and charset-unspecified text was scrubbed...

Name: form.clj

URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20210127/4d28aea6/attachment.ksh>

-------------- next part --------------


--

GPG Key ID: 7BC158ED

Use `gpg --search-keys lambdatronic' to find me

Protect yourself from surveillance: https://emailselfdefense.fsf.org

=======================================================================

() ascii ribbon campaign - against html e-mail

/\ www.asciiribbon.org - against proprietary attachments


Why is HTML email a security nightmare? See https://useplaintext.email/


Please avoid sending me MS-Office attachments.

See http://www.gnu.org/philosophy/no-word-attachments.html


--------

From: Katarina Eriksson

Date: Fri, 29 Jan 2021 13:05:31 +0100


Gary Johnson <lambdatronic at disroot.org> wrote


> => form?$SESSION&name Name: $NAME

>

> => form?$SESSION&password Password: $PASSWORD

>

> => form?$SESSION&smog SMOG is great: $SMOG

>

> => form?$SESSION&plant Best Astrobotany Plant: $PLANT

>

> => form?$SESSION&submit Submit Answers

>


[...]


> (Obviously, a more robust state management mechanism could be achieved

> with client certs and a DB, but I just mean to show a very simple

> example here.)

>


Yes, if the client supports client certificates, we can skip sending

$SESSION and use the regular inputs:


=> gemini://awesome.capsule.net/form/name Name
=> gemini://awesome.capsule.net/form/password Password
=> gemini://awesome.capsule.net/form/smog SMOG is great
=> gemini://awesome.capsule.net/form/plant Best Astrobotany Plant
=> gemini://awesome.capsule.net/form/submit Submit Answers

[...]


> Section 3.3: (DESIRED) Client-side Requests

>

>

> The intention of this example is that the clients would produce requests

> of this form after each input prompt:

>

> => gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson

> => gemini://awesome.capsule.net/form?$SESSION&password&secret

> => gemini://awesome.capsule.net/form?$SESSION&smog&yes

> => gemini://awesome.capsule.net/form?$SESSION&plant&Ficus

>

> where $SESSION is whatever value was generated by the CGI script on the

> first page load.

>


I do not understand this example.


When using regular inputs, the client will send these requests:


gemini://awesome.capsule.net/form/name?Gary%20Johnson

gemini://awesome.capsule.net/form/password?secret

gemini://awesome.capsule.net/form/smog?yes

gemini://awesome.capsule.net/form/plant?Ficus

gemini://awesome.capsule.net/form/submit


(No "?" on "submit" since it's just telling the server that we're done.)


What is the benefit of doing it your way?


> Section 3.4: Server-side State Management and Form Submission

>

>

> With this information in the query params, it would be easy to store a

> lookup table in the CGI script that mapped session -> field -> value,

> and these values can then be easily inserted into the original Gemtext

> template form above (see Section 3.1) in response to these requests.

>


If you format the URLs like this:


gemini://$HOST/path/to/script/$FIELD?$VALUE


...then $FIELD should show up as PATH_INFO (probably with a leading "/")

and $VALUE as QUERY_STRING.
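
A short Python sketch of how a CGI script might read that layout. It assumes the server sets PATH_INFO and QUERY_STRING as RFC 3875 describes; the function name is hypothetical.

```python
from urllib.parse import unquote


def field_and_value(environ):
    """Extract ($FIELD, $VALUE) for URLs shaped like /script/$FIELD?$VALUE.

    `environ` is a mapping of CGI variables, e.g. os.environ.
    The value is None when no query was sent (the client has not
    answered an INPUT prompt yet).
    """
    field = environ.get("PATH_INFO", "").lstrip("/")
    query = environ.get("QUERY_STRING", "")
    value = unquote(query) if query else None
    return field, value
```

In a real script this would be called as field_and_value(os.environ); a None value would be answered with a 1x INPUT response for that field.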


[...]


> The only problem I'm running into here is that the various Gemini

> clients I've tested (elpher, bombadillo, kristall) don't actually append

> a user's input as an additional parameter to an existing query string if

> one is present. Instead, bombadillo and kristall just overwrite the

> existing query string and only return ?$NEW_INPUT. Elpher, on the other

> hand, just creates invalid URLs by simply appending ?$NEW_INPUT to

> whatever is already in the URL (e.g.,

> gemini://awesome.capsule.net/form?$SESSION&smog?yes. Neither of these

> behaviors do what I'd want or expect here.

>


Elpher is doing something weird here but the others are handling inputs as

intended.


> Section 4.1: Check the Spec!

>

>

> I think the culprit then is probably Gemini Protocol Specification

> section 3.2.1 1x (INPUT):

>


[...]


>

> Section 4.2: Append Don't Replace!

>

>

> As far as I can tell, the fix here is for Solderpunk to update the text

> in section 3.2.1 to indicate that if a query string is already part of

> the request leading to an INPUT response, then the user's input should

> be appended (using &) to the existing query string rather than replacing

> it wholesale (using ?).

>


This is not a necessary spec change.


> Otherwise, we really have no way to input more than one query param

> (with &) other than asking the user to type it directly into the INPUT

> prompt (e.g., cat&dog&pig).



The responsibility for collecting parameters falls on the server, not on the

client. The only thing the client needs to do is send one query for each

field.


> I'm hoping this isn't the spec's intention

> here and that we just have a case of ambiguous wording that has led some

> client authors to create divergent (or broken) implementations

>


Sorry to disappoint you. I suggest leaving the ampersands to the web

queries.


[...]


> I've attached a short (47 line) CGI script (for Space Age) that

> implements the dynamic form example described in this email.

>


Thank you for providing example code and I'm sorry for not doing the same.


--

Katarina




--------

From: Gary Johnson

Date: Sat, 30 Jan 2021 15:54:39 -0500


>> ## Section 3.3: (DESIRED) Client-side Requests

>>

>>

>> The intention of this example is that the clients would produce requests

>> of this form after each input prompt:

>>

>> => gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson

>> => gemini://awesome.capsule.net/form?$SESSION&password&secret

>> => gemini://awesome.capsule.net/form?$SESSION&smog&yes

>> => gemini://awesome.capsule.net/form?$SESSION&plant&Ficus

>>

>> where $SESSION is whatever value was generated by the CGI script on the

>> first page load.

>>

>

> I do not understand this example.

>

> When using regular inputs, the client will send these requests:

>

> gemini://awesome.capsule.net/form/name?Gary%20Johnson

> gemini://awesome.capsule.net/form/password?secret

> gemini://awesome.capsule.net/form/smog?yes

> gemini://awesome.capsule.net/form/plant?Ficus

> gemini://awesome.capsule.net/form/submit

>

> (No "?" on "submit" since it's just telling the server that we're done.)

>

> What is the benefit of doing it your way?


Hi Katarina,


Thanks for taking the time to reply to my message. I'll try to clarify

my point here.


The issue I'm raising is that there appears to be no way to pass more

than one piece of information at a time in our query strings. This has a

very significant impact on any writers of CGI scripts, which is how many

Gemini servers allow users to add dynamic pages to their capsules.


But why, you ask?


Because each CGI script is available at a particular file path and

therefore additional path segments can't be used to pass information to

them. They have to get their inputs from the query string.


This is a script. It probably returns a 20 response:


gemini://awesome.capsule.net/form.clj


If I want to fill in a name field on that page, I might provide a link

like this:


gemini://awesome.capsule.net/form.clj?name


This calls the CGI script with a query parameter. Great! The script can

use "name" to look up the appropriate response. Here it is:


10 Please enter your name\r\n


However, when the user fills in their name, the browser will now send

this request to the server:


gemini://awesome.capsule.net/form.clj?Gary%20Johnson


There is no way for the CGI script to know that this is a name value and

not the value for any other form field on the page.


And therein lies the rub. If the only way to associate input values with

the variables they represent is with path segments, then CGI scripts

simply can't ever use more than one input field per page. Even then, if

the query string used to trigger a 10 INPUT response is typed by the

user (into the totally free form text field they are presented), then

the server will continue to respond with yet another 10 INPUT response.


This would make a form with N fields require N+1 separate CGI scripts,

all chained together via links that represent the directory structure

into which they are installed.


This is an absolute nightmare scenario for programming anything that

wants to accept user inputs.


So what does this mean for Geminispace?


It means essentially that CGI scripts are currently second-class

citizens, and the only people who can write dynamic capsules are server

authors (or people willing to hack on server code). This is because

encoding information using path segments requires injecting custom

routing table code into the server's request handler.


As a server author, I am capable of creating a custom fork of my server

with a new routing table for each dynamic capsule I want to build.

However, I suspect the majority of Gemini users are not going to have

both the skill and willingness to engage in this level of coding on

their pages.


That is why I and many other authors have added support for CGI scripts

to our servers. But under the "only one piece of information in the

query string" paradigm, these scripts are currently rather handicapped

when it comes to accepting user input.


Hopefully, I've made the technical merits of my case clear here.



>> ## Section 4.2: Append Don't Replace!

>>

>>

>> As far as I can tell, the fix here is for Solderpunk to update the text

>> in section 3.2.1 to indicate that if a query string is already part of

>> the request leading to an INPUT response, then the user's input should

>> be appended (using &) to the existing query string rather than replacing

>> it wholesale (using ?).

>>

>

> This is not a necessary spec change.



Yes, it really is if anyone other than server authors is ever going to

be able to write their own dynamic pages.



>> Otherwise, we really have no way to input more than one query param

>> (with &) other than asking the user to type it directly into the INPUT

>> prompt (e.g., cat&dog&pig).

>

>

> The responsibility for collecting parameters falls on the server, not on the

> client. The only thing the client needs to do is send one query for each

> field.



Again, see above. A single query value cannot be associated with its

variable without adding a custom routing table to the server to enable

the parsing of path segment data as additional inputs.



>> I'm hoping this isn't the spec's intention

>> here and that we just have a case of ambiguous wording that has led some

>> client authors to create divergent (or broken) implementations

>>

>

> Sorry to disappoint you. I suggest leaving the ampersands to the web

> queries.



I'm afraid we disagree here.



> Thank you for providing example code and I'm sorry for not doing the same.



If you can write a CGI script that can correctly associate INPUT

responses with their intended variables, please share it. I suspect it

would be quite educational.


Happy hacking,

Gary



--------

From: Chris Babcock

Date: Sat, 30 Jan 2021 22:19:25 +0000


This is going to be weird, because I disagree with almost everything you've said except that appending the query string should be guaranteed. I hope this is helpful


January 30, 2021 1:54 PM, "Gary Johnson" <lambdatronic at disroot.org> wrote:


> The issue I'm raising is that there appears to be no way to pass more

> than one piece of information at a time in our query strings. This has a

> very significant impact on any writers of CGI scripts, which is how many

> Gemini servers allow users to add dynamic pages to their capsules.

>

> But why, you ask?

>

> Because each CGI script is available at a particular file path and

> therefore additional path segments can't be used to pass information to

> them. They have to get their inputs from the query string.


%< ------------------------------


> This would make a form with N fields require N+1 separate CGI scripts,

> all chained together via links that represent the directory structure

> into which they are installed.

>

> This is an absolute nightmare scenario for programming anything that

> wants to accept user inputs.


Well, you *could* pass extra path info to the script... So, the script at cgi-bin/index.cgi handles all cgi-bin/* and treats the path after cgi-bin as positional arguments


> So what does this mean for Geminispace?

>

> It means essentially that CGI scripts are currently second-class

> citizens, and the only people who can write dynamic capsules are server

> authors (or people willing to hack on server code). This is because

> encoding information using path segments requires injecting custom

> routing table code into the server's request handler.


CGI scripts *are* second class citizens in Gemini, but it's because the UX and dev-op experience of line based input is terrible. The fact that a static routing table is more performant and has a better security profile than parsing the path info dynamically is less relevant than the fact that this is a line based protocol


%< ------------------------------


>> ## Section 4.2: Append Don't Replace!

>>> As far as I can tell, the fix here is for Solderpunk to update the text

>>> in section 3.2.1 to indicate that if a query string is already part of

>>> the request leading to an INPUT response, then the user's input should

>>> be appended (using &) to the existing query string rather than replacing

>>> it wholesale (using ?).

>>

>> This is not a necessary spec change.

>

> Yes, it really is if anyone other than server authors is ever going to

> be able to write their own dynamic pages.

>


Now, "Append, don't replace," is a reasonable expectation to make of clients and it's still useful for the devops situation, even if it's not *strictly* necessary


%< ------------------------------


> If you can write a CGI script that can correctly associate INPUT

> responses with their intended variables, please share it. I suspect it

> would be quite educational.


The two alternatives to requiring clients to preserve collected state in the query parameter are to save state in the CGI script or to pass positional arguments via the path. I think append is reasonable. It also preserves principle of least surprise and other desirable qualities


CGI *is* going to be second class in Gemini as long as forms aren't an option, but that's a consequence of the decision to support line-based clients. Appending the query doesn't do violence to that design


Chris


--------

From: Sean Conner

Date: Sat, 30 Jan 2021 18:59:47 -0500


It was thus said that the Great Gary Johnson once stated:

>

> The issue I'm raising is that there appears to be no way to pass more

> than one piece of information at a time in our query strings. This has a

> very significant impact on any writers of CGI scripts, which is how many

> Gemini servers allow users to add dynamic pages to their capsules.

>

> But why, you ask?

>

> Because each CGI script is available at a particular file path and

> therefore additional path segments can't be used to pass information to

> them. They have to get their inputs from the query string.


[ snip ]


> It means essentially that CGI scripts are currently second-class

> citizens, and the only people who can write dynamic capsules are server

> authors (or people willing to hack on server code). This is because

> encoding information using path segments requires injecting custom

> routing table code into the server's request handler.


Not if the CGI interface is properly written. All I had to do was write

this CGI script and drop it into my tests directory [1]:


gemini://gemini.conman.org/test/pathseg.cgi


It uses only three of the RFC-3875 defined variables, QUERY_STRING,

SCRIPT_NAME and PATH_INFO to do all the work. The script will ask for three

fields and then present a final page with all three fields. But the script

will only work if all three variables are defined per RFC-3875 (PATH_INFO is

the tricky one).


Yes, it's a bit ugly and yes, it's a second class citizen and yes, it

requires a proper CGI module to work, but it can be done without the

configuration you think it does. The script just simply appends each input

field as the path, so if you enter


and a one

and a two

skidoosh


as the values, the final URL will be:


gemini://gemini.conman.org/test/pathseg.cgi/and%20a%20one/and%20a%20two/skidoosh


Yes, I could have done a bit more processing, naming each segment:


/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh


but I was lazy and wanted to just do a proof-of-concept here.
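
For illustration, the named-segment variant could be parsed along these lines. This is a Python sketch, not part of the Lua proof-of-concept; the function name is hypothetical, and it assumes the segments are still percent-encoded when the script sees them (drop the unquote calls if the server decodes PATH_INFO first).

```python
from urllib.parse import unquote


def parse_named_segments(path_info):
    """Turn '/name=a/age=b' into {'name': 'a', 'age': 'b'}.

    Segments without an '=' are ignored.
    """
    fields = {}
    for segment in path_info.split("/"):
        if not segment:
            continue
        key, sep, value = segment.partition("=")
        if sep:
            fields[unquote(key)] = unquote(value)
    return fields
```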


> If you can write a CGI script that can correctly associate INPUT

> responses with their intended variables, please share it. I suspect it

> would be quite educational.


I have added it [3].


-spc


[1] I gave the script a .cgi extension just to drive the point

home---for my server, GLV-1.12556 [2], the extension of a CGI script

doesn't matter at all.


[2] https://github.com/spc476/GLV-1.12556


[3] Here you go. It's in Lua, but it's easy going except for the first

bit which is a bit of boilerplate I needed for encoding and

decoding various strings. The main logic is marked though, so you

can skip the first section.


-- ************************************************************************
-- Decoding and Encoding crap, not much to see here, citizen! Move along!
-- ************************************************************************

local lpeg = require "lpeg"
local xdigit = lpeg.locale().xdigit
local char = lpeg.P"%" * lpeg.C(xdigit * xdigit)
           / function(c)
               return string.char(tonumber(c,16))
             end
           + lpeg.P"+" / " "
           + lpeg.P(1)
local decode_query = lpeg.Cs(char^1)

local function tohex(c)
  return string.format("%%%02X",string.byte(c))
end

local unsafe = lpeg.P" " / "%%20"
             + lpeg.P"#" / "%%23"
             + lpeg.P"%" / "%%25"
             + lpeg.P"<" / "%%3C"
             + lpeg.P">" / "%%3E"
             + lpeg.P"[" / "%%5B"
             + lpeg.P"\\" / "%%5C"
             + lpeg.P"]" / "%%5D"
             + lpeg.P"^" / "%%5E"
             + lpeg.P"{" / "%%7B"
             + lpeg.P"|" / "%%7C"
             + lpeg.P"}" / "%%7D"
             + lpeg.P'"' / "%%22"
             + lpeg.R("\0\31","\127\255") / tohex
local char_path = lpeg.P"?" / "%%3F"
                + unsafe
                + lpeg.P(1)
local esc_path = lpeg.Cs(char_path^0)

-- ************************************************************************
-- The main script starts here
-- ************************************************************************

local query = os.getenv("QUERY_STRING")
local script_name = os.getenv("SCRIPT_NAME")
local pathinfo = os.getenv("PATH_INFO")

if not pathinfo and query == "" then
  io.stdout:write("Status: 10\n")
  io.stdout:write("Content-Type: Input field\n")
  io.stdout:write("\n")
  os.exit(0,true)
end

if not pathinfo then
  query = decode_query:match(query)
  query = esc_path:match(query)
  io.stdout:write("Status: 30\n")
  io.stdout:write(string.format("Location: %s/%s\n",script_name,query))
  io.stdout:write("\n")
  os.exit(0,true)
end

if pathinfo:match("^/[^/]*/[^/]*/[^/]*") then
  local f1,f2,f3 = pathinfo:match("^/([^/]*)/([^/]*)/([^/]*)")
  f1 = decode_query:match(f1)
  f2 = decode_query:match(f2)
  f3 = decode_query:match(f3)

  io.stdout:write("Status: 20\n")
  io.stdout:write("Content-Type: text/gemini\n")
  io.stdout:write("\n")
  io.stdout:write("The three fields you input:\n")
  io.stdout:write("\n")
  io.stdout:write(string.format("* %s\n",f1))
  io.stdout:write(string.format("* %s\n",f2))
  io.stdout:write(string.format("* %s\n",f3))
  io.stdout:write("\n")
  io.stdout:write(string.format("=> %s Try again\n",script_name))
  os.exit(0,true)
end

if query == "" then
  io.stdout:write("Status: 10\n")
  io.stdout:write("Content-Type: Input next field\n")
  io.stdout:write("\n")
  os.exit(0,true)
else
  query = decode_query:match(query)
  query = esc_path:match(query)
  pathinfo = esc_path:match(pathinfo)
  io.stdout:write("Status: 30\n")
  io.stdout:write(string.format("Location: %s%s/%s\n",script_name,pathinfo,query))
  io.stdout:write("\n")
  os.exit(0,true)
end


--------

From: John Cowan

Date: Sat, 30 Jan 2021 22:09:55 -0500


On Wed, Jan 27, 2021 at 5:20 PM Gary Johnson <lambdatronic at disroot.org>

wrote:



> I don't think form-like data submission should be seen as an evil. It

> allows us to implement a wide variety of CGI-style applications that do

> all their computing on the server side (often through some script

> extension mechanism).

>


+1


> which $SESSION, $NAME, $PASSWORD, $SMOG, and $PLANT are defined (or default

> to empty strings). When the page first loads, we create a new

> $SESSION value in our CGI script and insert it into the links to

> preserve state across requests until we restart the server or the user

> refreshes the page.

>


I think this is exactly the Right Thing.



> (Obviously, a more robust state management mechanism could be achieved

> with client certs and a DB, but I just mean to show a very simple

> example here.)

>


A TLS session is not the same as an application session. I may, for

example, have two tabs (or whatever) open in my Gemini browser that refer

to the same access-controlled capsule, and which therefore must be accessed

with the same cert. Nevertheless, the two pages should operate as distinct

sessions: I should be able to fill out a form in one page while searching

help documents in the other. So I think a session ID is the Right Thing.

However, this is a matter of server/capsule/CGI design, not of the Gemini

protocol.


> While this example creates more back-and-forth requests than a proper

> client-side form would generate, I hope it demonstrates that Gemini and

> Gemtext in their current incarnations are already sufficiently complete

> to build interactive CGI applications with them today.

>


The biggest problem is most likely the cost of setting up and tearing down

all the TLS connections, but there is no help for that.


> The requested resource accepts a line of textual user input. The <META>

> line is a prompt which should be displayed to the user. The same

> resource should then be requested again with the user's input included

> as a query component.



"Included" is a vague word, and should be fixed whether we do appending or

not.


> As far as I can tell, the fix here is for Solderpunk to update the text

> in section 3.2.1 to indicate that if a query string is already part of

> the request leading to an INPUT response, then the user's input should

> be appended (using &) to the existing query string rather than replacing

> it wholesale (using ?).

>


I suggest that if there is no query part, we append ? followed by the

user's input, whereas if there is, we just append the user's input. That

lets a simple form work like this:


1) Suppose Fluffy (a server) wants me to send my name and email address.

Fluffy sends this bare-bones text/gemini document, which we will say comes

from gemini://fluffy.example/form1, to my client Aarfy.


?name=

?email=


2) Let's say I choose the first link. Fluffy sends Aarfy 10 Enter your name.

I type John Cowan into Aarfy, which sends the URL

gemini://fluffy.example/form1?session=ABC&name=John%20Cowan. Fluffy sends

this new document to Aarfy:


[John Cowan]: ?session=ABC&name=

?session=ABC&name=John%20Cowan&email=


3) If I choose the first link, I can change my name. If I choose the

second link, Fluffy will send Aarfy 10 Enter your email. I type

cowan at ccil.org into Aarfy, which sends the URL

gemini://fluffy.example/form1?session=ABC&name=John%20Cowan&email=

cowan at ccil.org. Fluffy sends this third document to Aarfy:


[John Cowan]: ?session=ABC&email=cowan at ccil.org&name=

[cowan at ccil.org] ?session=ABC&name=John%20Cowan&email=

[cowan at ccil.org] ?session=ABC&name=John%20Cowan&email=cowan at ccil.org&submit


4) If I choose the first or second link again, I can change my name or

email address. But if I choose the third link, which Fluffy does *not*

interpret as a search link, Fluffy will write my name and email into a

database, or send me an email saying "HA HA HA!", or whatever it does.
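
The documents Fluffy sends in steps 1 to 3 could be generated along these lines. This is only a minimal sketch: it assumes the standard gemtext link line (`=>`, URL, then label), and the function name `edit_link`, the label wording, and the field order are all hypothetical. The point it illustrates is that the field being edited is moved to the end of the query with an empty value, so the client's input lands there.

```lua
-- Percent-encode everything except RFC 3986 unreserved characters.
local function pct_encode(s)
  return (s:gsub("[^%w%-%._~]", function(c)
    return string.format("%%%02X", c:byte())
  end))
end

-- Build one gemtext link whose query ends in "<target>=", so that the
-- client's next input lands in that field. Fields that are already known
-- are carried along in front of it. All names here are hypothetical.
local function edit_link(session, fields, order, target)
  local parts = { "session=" .. session }
  for _, name in ipairs(order) do
    if name ~= target and fields[name] then
      parts[#parts + 1] = name .. "=" .. pct_encode(fields[name])
    end
  end
  parts[#parts + 1] = target .. "="
  local label = fields[target] and ("[" .. fields[target] .. "]")
    or ("Enter " .. target)
  return "=> ?" .. table.concat(parts, "&") .. " " .. label
end

-- State after step 2: the name is known, the email is not.
local order = { "name", "email" }
local fields = { name = "John Cowan" }
for _, target in ipairs(order) do
  print(edit_link("ABC", fields, order, target))
end
```

No state is kept on the server between requests here; everything needed to render the next page arrives in the query string.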


Because all that happens is following links and reading input lines, it

does not matter if Aarfy is a CLI, TUI, or GUI client: the protocol

exchanges work in any case. Furthermore, Fluffy does not have to retain

partial state, because it is passed back and forth between Aarfy and Fluffy

with no real interpretation at either end until Fluffy receives a submission

URL.
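
On the client side, the rule suggested above (add "?" only when there is no query part, otherwise append the input directly) might look like the following sketch. The function names are made up for illustration, and fragments are ignored for simplicity.

```lua
-- Percent-encode everything except RFC 3986 unreserved characters.
local function pct_encode(s)
  return (s:gsub("[^%w%-%._~]", function(c)
    return string.format("%%%02X", c:byte())
  end))
end

-- Build the follow-up request URL after a 10 INPUT response.
-- If the URL has no query part yet, start one with "?"; if it already
-- has one, append the encoded input to it unchanged.
local function next_url(url, input)
  local encoded = pct_encode(input)
  if url:find("?", 1, true) then
    return url .. encoded
  end
  return url .. "?" .. encoded
end

print(next_url("gemini://fluffy.example/form1?session=ABC&name=", "John Cowan"))
print(next_url("gemini://fluffy.example/form1", "John Cowan"))
```

The first call yields the URL from step 2; the second shows that a plain search-style request still behaves as it does today.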


For that matter there is no real need to have a submission link: an URL

that specifies both name and email could be interpreted as a submission.

As before, this is a matter of design, not protocol.




John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org

There is no real going back. Though I may come to the Shire, it will

not seem the same; for I shall not be the same. I am wounded with

knife, sting, and tooth, and a long burden. Where shall I find rest?

--Frodo



--------

From: Martin Bays

Date: Sun, 31 Jan 2021 09:51:53 +0100


Saturday, 2021-01-30 at 22:09 -0500 - John Cowan <cowan at ccil.org>:


>> Gemtext in their current incarnations are already sufficiently

>> complete to build interactive CGI applications with them today.

>

>The biggest problem is most likely the cost of setting up and tearing down

>all the TLS connections, but there is no help for that.


Well, there is "0-RTT" TLS session resumption with early data. That

reduces the overheads substantially (though it still requires a fresh

TCP connection for each request). As far as I know no server supports

0-RTT currently, and I think only a few clients do. But it would fit

well with heavy CGI use.



--------

From: Gary Johnson

Date: Sun, 31 Jan 2021 13:07:51 -0500


Sean Conner <sean at conman.org> writes:


> Not if the CGI interface is properly written. All I had to do was write

> this CGI script and drop it into my tests directory [1]:

>

> gemini://gemini.conman.org/test/pathseg.cgi

>

> It uses only three of the RFC-3875 defined variables, QUERY_STRING,

> SCRIPT_NAME and PATH_INFO to do all the work. The script will ask for three

> fields and then present a final page with all three fields. But the script

> will only work if all three variables are defined per RFC-3875 (PATH_INFO is

> the tricky one).

>

> Yes, it's a bit ugly and yes, it's a second class citizen and yes, it

> requires a proper CGI module to work, but it can be done without the

> configuration you think it does. The script just simply appends each input

> field as the path, so if you enter

>

> and a one

> and a two

> skidoosh

>

> as the values, the final URL will be:

>

> gemini://gemini.conman.org/test/pathseg.cgi/and%20a%20one/and%20a%20two/skidoosh

>

> Yes, I could have done a bit more processing, naming each segment:

>

> /test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh

>

> but I was lazy and wanted to just do a proof-of-concept here.
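
The named-segment variant quoted above could be picked apart by a CGI script roughly as follows. This is a sketch only: it assumes the server hands over PATH_INFO already percent-decoded, as RFC 3875 describes, and the function name `parse_path_info` is hypothetical.

```lua
-- Split PATH_INFO on "/" and read each segment as name=value.
-- Segments without "=" are kept positionally (fields[1], fields[2], ...).
-- Assumes PATH_INFO arrives already percent-decoded from the server.
local function parse_path_info(path_info)
  local fields = {}
  for segment in path_info:gmatch("[^/]+") do
    local name, value = segment:match("^([^=]+)=(.*)$")
    if name then
      fields[name] = value
    else
      fields[#fields + 1] = segment
    end
  end
  return fields
end

local f = parse_path_info("/name=and a one/age=and a two/action=skidoosh")
print(f.name, f.age, f.action)
```

A value that itself contains "/" cannot be represented this way, which is one practical limit of carrying input in path segments.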


Thanks for sharing some code, Sean. I, of course, realize that one could

write a CGI script to pick apart the PATH_INFO for user inputs. The

issue I raised in my message was that this doesn't make any sense in the

context of a CGI script which is looked up using the path on the remote

filesystem.


In your example, your script is located at /test/pathseg.cgi. However,

lacking side information, I see no indicator (outside of the --

admittedly optional -- cgi extension on your file name) of which path

segments should be considered part of the CGI filename lookup and which

parts are meant to be user input data in your example link:


/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh


This feels like a massive hack to me and an abuse of path segments TBH.


If I were to embrace this approach, I can see that I would have to

reprogram my server to do some additional path preprocessing magic. I

could either:


1. Check every sequence of path segments starting from the document root

to see if any of them correspond to an executable file or have the

blessed CGI file extension for my server.


2. (To use Chris Babcock's suggestion), check every sequence of path

segments starting from the document root to see if any of them

correspond to a directory containing an index.cgi file.


3. Include an input parameter to my server (on the command line or in a

config file) that specifies a particular mapping between path

segments and CGI scripts on my filesystem. That is, I would be

defining a routing table at server start time. This approach has the

unfortunate side effect of preventing users on a pubnix from

installing CGI scripts in their ~/public_gemini capsules without

getting the server administrator to update the global routing table

on their behalf. Alternatively, it would require each ~/public_gemini

capsule to include a routing table config file within it if it wanted

to support CGI scripts, and these would have to be read and parsed by

the server at server start time and again on a periodic interval or

event-based basis in order to support new user scripts as they are

added without having to restart the server.


Once one of these 3 approaches enables the server to successfully detect

that a particular path corresponds to a CGI script that is not actually

located where that path is pointing, then the server would need to

execute that script with PATH_INFO bound to the entire path. Every

installed CGI script would then be responsible for manually removing

SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,

which puts an additional burden on CGI developers.



So I've now heard from multiple folks that we should all just get on

with these path segment hacks and accept that as the best we can do in

Gemini.


While I can see that it's technically possible (though arguably ugly) to

do so, I suppose my question is:


"What exactly does Gemini lose by allowing chained query parameters?

(with &)"


I can't for the life of me see any downside. It should literally be one

line of code changed in your favorite Gemini client. Just append inputs

to the query string if one already exists rather than replacing the

query string outright.


I believe John Cowan is right that "include" is too vague a word in

Solderpunk's current specification for the 10 INPUT field. Both

appending and replacing are forms of inclusion, so any Gemini client

author who chooses to append shouldn't be in violation of the spec as it

is currently worded.


And changing that one line in your client could save every CGI script

writer in Geminispace a lot of additional work (as clearly demonstrated

by the examples shared by Katarina, Chris, and Sean).


Seems like a very positive return on investment for a small change.


What am I missing here, folks?


Any chance of weighing in here, Solderpunk?



With best intentions,

Gary


--

GPG Key ID: 7BC158ED

Use `gpg --search-keys lambdatronic' to find me

Protect yourself from surveillance: https://emailselfdefense.fsf.org

=======================================================================

() ascii ribbon campaign - against html e-mail

/\ www.asciiribbon.org - against proprietary attachments


Why is HTML email a security nightmare? See https://useplaintext.email/


Please avoid sending me MS-Office attachments.

See http://www.gnu.org/philosophy/no-word-attachments.html


--------

From: Sean Conner

Date: Sun, 31 Jan 2021 18:16:18 -0500


It was thus said that the Great Gary Johnson once stated:

> Sean Conner <sean at conman.org> writes:

>

> > Not if the CGI interface is properly written. All I had to do was write

> > this CGI script and drop it into my tests directory [1]:

> >

> > gemini://gemini.conman.org/test/pathseg.cgi


[ snip ]



> Thanks for sharing some code, Sean. I, of course, realize that one could

> write a CGI script to pick apart the PATH_INFO for user inputs. This

> issue I raised in my message was that this doesn't make any sense in the

> context of a CGI script which is looked up using the path on the remote

> filesystem.

>

> In your example, your script is located at /test/pathseg.cgi. However,

> lacking side information, I see no indicator (outside of the --

> admittedly optional -- cgi extension on your file name) of which path

> segments should be considered part of the CGI filename lookup and which

> parts are meant to be user input data in your example link:

>

> /test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh


That's a particular implementation detail of GLV-1.12556 [1]. Other

servers could require the extension, or some other mechanism.


> This feels like a massive hack to me and an abuse of path segments TBH.

>

> If I were to embrace this approach, I can see that I would have to

> reprogram my server to do some additional path preprocessing magic. I

> could either:

>

> 1. Check every sequence of path segments starting from the document root

> to see if any of them correspond to an executable file or have the

> blessed CGI file extension for my server.


I see your server just accepts the requested path as is. GLV-1.12556

(once it gets into the filesystem handler) walks down the document root

checking each path segment looking for an executable file (which indicates a

CGI script) or symbolic link (which indicates a SCGI script).


> Once one of these 3 approaches enables the server to successfully detect

> that a particular path corresponds to a CGI script that is not actually

> located where that path is pointing, then the server would need to

> execute that script with PATH_INFO bound to the entire path. Every

> installed CGI script would then be responsible for manually removing

> SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,

> which puts an additional burden on CGI developers.


If you want to follow RFC-3875, that's not the case. PATH_INFO only

contains data past the script name (section 4.1.5). This link:


gemini://gemini.conman.org/cgi


returns


SCRIPT_NAME = /cgi


There is no PATH_INFO or PATH_TRANSLATED because it's not needed. However:


gemini://gemini.conman.org/cgi/path/to/nowhere


returns


SCRIPT_NAME = /cgi

PATH_INFO = /path/to/nowhere

PATH_TRANSLATED = /home/spc/projects/gemini/non-checkin/gemini.conman.org/path/to/nowhere


The work is on the server side, not the CGI script side.
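
The server-side walk described above can be sketched as below. This is not the GLV-1.12556 code: `split_script_path` is a hypothetical name, and the `is_script` predicate stands in for whatever filesystem check (executable bit, symbolic link, extension) a real server would perform.

```lua
-- Walk the request path one segment at a time. The first prefix that
-- names a script becomes SCRIPT_NAME; whatever follows becomes PATH_INFO
-- (nil when nothing follows). Returns nil if no prefix is a script.
-- is_script is a caller-supplied predicate (hypothetical).
local function split_script_path(path, is_script)
  local prefix = ""
  local pos = 1
  while true do
    local s, e = path:find("/[^/]*", pos)
    if not s then
      return nil
    end
    prefix = prefix .. path:sub(s, e)
    pos = e + 1
    if is_script(prefix) then
      local rest = path:sub(pos)
      if rest == "" then
        rest = nil
      end
      return prefix, rest
    end
  end
end

-- Stand-in for a filesystem check: pretend only /cgi is executable.
local scripts = { ["/cgi"] = true }
local function is_script(p)
  return scripts[p] == true
end

print(split_script_path("/cgi", is_script))
print(split_script_path("/cgi/path/to/nowhere", is_script))
```

With the split done here, the script itself only ever sees the part of the path that follows its own name.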


> So I've now heard from multiple folks that we should all just get on

> with these path segment hacks and accept that as the best we can do in

> Gemini.

>

> While I can see that it's technically possible (though arguably ugly) to

> do so, I suppose my question is:

>

> "What exactly does Gemini lose by allowing chained query parameters?

> (with &)"


Nothing as far as I can see, as long as the characters '=' and '&' are

escaped if they appear in the input (to prevent confusion).
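
As a rough illustration of that caveat, here is a sketch of a client chaining input onto an existing query with "&", percent-encoding the input so that a literal '=' or '&' typed by the user cannot be mistaken for a separator. The function names and the example.org URL are hypothetical.

```lua
-- Percent-encode everything except RFC 3986 unreserved characters,
-- which also escapes "=" and "&" in user input.
local function pct_encode(s)
  return (s:gsub("[^%w%-%._~]", function(c)
    return string.format("%%%02X", c:byte())
  end))
end

-- Decode %XX escapes.
local function pct_decode(s)
  return (s:gsub("%%(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end))
end

-- Client side: start a query with "?" or chain onto an existing one with "&".
local function append_input(url, input)
  local sep = url:find("?", 1, true) and "&" or "?"
  return url .. sep .. pct_encode(input)
end

-- Server side: split on "&" first, decode each part afterwards.
local function split_query(query)
  local parts = {}
  for part in query:gmatch("[^&]+") do
    parts[#parts + 1] = pct_decode(part)
  end
  return parts
end

local url = append_input("gemini://example.org/app.cgi?session=ABC", "a=b&c")
print(url)
local parts = split_query(url:match("%?(.*)$"))
print(#parts, parts[2])
```

Because splitting happens before decoding, the user's input comes back as a single part even though it contains both reserved characters.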


> What am I missing here, folks?


Somebody to do a proof-of-concept probably.


> Any chance of weighing in here, Solderpunk?


Is he still alive?


-spc


[1] https://github.com/spc476/GLV-1.12556


--------

-- Response ended

-- Page fetched on Sun Jun 2 02:40:16 2024