
digitalmars.D.announce - $750 Bounty: Issue 16416 - Phobos std.uni out of date (should be updated to latest Unicode standard)

reply Robert M. Münch <robert.muench saphirion.com> writes:
Following my "Is it time for a Unicode update of std.uni?" post in the D 
group, I would like to try sponsoring the work on "Issue 16416 
- Phobos std.uni out of date (should be updated to latest Unicode 
standard)" [1].

For me, this is also an experiment to find out whether it's possible to 
move specific issues/topics forward, and maybe even to find people who 
are open to contract work. For me, all these things are closely 
related.

So, not knowing how much work it is, nor knowing what a good amount 
would be, I took the other route and asked myself, what is it worth for me 

likely, I could even live with the current state of std.uni. On the 
other hand, std.uni is a very fundamental building block, and having it 
up to date and maybe even extended should be of great value to D.

So, I'm offering $750 to get it done.

Besides getting the work done, there is one constraint: the work needs 
to get into Phobos. It doesn't make sense to have it sit around 
unmerged. I don't have any clue who is in charge or 
who decides this, or whether some conditions need to be fulfilled so 
that the result gets merged.

[1] https://issues.dlang.org/show_bug.cgi?id=16416

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
May 04
next sibling parent reply Arine <arine1283798123 gmail.com> writes:
On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Besides getting the work done, there is one constraint: The 
 work needs to get into Phobos. It doesn't make sense to have it 
 sit around, because it's not being merged. I don't have any 
 clue who is in charge, who decides this. Or if there need to be 
 some conditions full-filled so that the result gets merged.
I feel like this is going to be the biggest obstacle. I worked on a bug bounty in the past, made a pull request, and it just sat there for months. It's a waste of time to try and get anything merged. Especially on the scale that this would be at.
May 04
next sibling parent Robert M. Münch <robert.muench saphirion.com> writes:
On 2020-05-04 17:30:41 +0000, Arine said:

 I feel like this is going to be the biggest obstacle. I worked on a bug 
 bounty in the past, made a pull request, and it just sat there for 
 months. It's a waste of time to try and get anything merged. Especially 
 on the scale that this would be at.
Thanks for the feedback. Was the PR eventually merged? Did you get any feedback why it wasn't merged, what needs to be done so that it gets merged, who decides this, etc.?

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
May 04
prev sibling parent reply welkam <wwwelkam gmail.com> writes:
On Monday, 4 May 2020 at 17:30:41 UTC, Arine wrote:
 On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Besides getting the work done, there is one constraint: The 
 work needs to get into Phobos. It doesn't make sense to have 
 it sit around, because it's not being merged. I don't have any 
 clue who is in charge, who decides this. Or if there need to 
 be some conditions full-filled so that the result gets merged.
I feel like this is going to be the biggest obstacle.
If the changes to Phobos do not introduce breaking changes, then I don't see why an update to std.uni would not be merged.
May 04
parent Robert M. Münch <robert.muench saphirion.com> writes:
On 2020-05-04 21:14:49 +0000, welkam said:

 If changes to phobos do not bring breaking changes then I dont see how 
 update to std.uni might not be merged
Well, but that's a weak statement for an investment. If Unicode evolves in a way that results in breaking changes, what to do? Never update? That doesn't make sense... So breaking changes that Unicode requires have to be accepted, IMO.

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
May 05
prev sibling next sibling parent reply notna <notna.remove.this ist-einmalig.de> writes:
On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Following my "Is it time for a Unicode update of std.uni?" post 
 in D group, I would like to try out to sponsor this effort for 
 "Issue 16416 - Phobos std.uni out of date (should be updated to 
 latest Unicode standard)" [1]
 So, I'm offering $750 to get it done.

 Besides getting the work done, there is one constraint: The 
 work needs to get into Phobos. It doesn't make sense to have it 
 sit around, because it's not being merged. I don't have any 
 clue who is in charge, who decides this. Or if there need to be 
 some conditions full-filled so that the result gets merged.

 [1] https://issues.dlang.org/show_bug.cgi?id=16416
Never used std.uni as far as I know ;) _BUT_ I think this is a great initiative, thanks! Maybe you want to add an additional constraint... It would be great if this would result in a tool, scripts or at least a simple-to-follow to-do (say Wiki?!)... so best case we could use this also for the next updates / releases in the future?!
May 04
next sibling parent notna <notna.remove.this ist-einmalig.de> writes:
On Monday, 4 May 2020 at 19:26:28 UTC, notna wrote:
 On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Following my "Is it time for a Unicode update of std.uni?" 
 post in D group, I would like to try out to sponsor this 
 effort for "Issue 16416 - Phobos std.uni out of date (should 
 be updated to latest Unicode standard)" [1]
 So, I'm offering $750 to get it done.

 Besides getting the work done, there is one constraint: The 
 work needs to get into Phobos. It doesn't make sense to have 
 it sit around, because it's not being merged. I don't have any 
 clue who is in charge, who decides this. Or if there need to 
 be some conditions full-filled so that the result gets merged.

 [1] https://issues.dlang.org/show_bug.cgi?id=16416
Never used std.uni as far as I know ;) _BUT_ I think this is a great initiative, thanks! Maybe you want to add an additional constraint... It would be great if this would result in a tool, scripts or at least a simple-to-follow to-do (say Wiki?!)... so best case we could use this also for the next updates / releases in the future?!
sorry, think this is what you want with 2) :O So just great, BIG thanks!
May 04
prev sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 05/05/2020 7:26 AM, notna wrote:
 Maybe you want to add an additional constraint... It would be great if 
 this would result in a tool, scripts or at least a simple-to-follow 
 to-do (say Wiki?!)... so best case we could use this also for the next 
 updates / releases in the future?!
It wouldn't help. The reason we can't just grab a newer copy of the Unicode database and throw it into Phobos is that the format was changed.
May 04
parent reply Robert M. Münch <robert.muench saphirion.com> writes:
On 2020-05-04 21:34:27 +0000, rikki cattermole said:

 On 05/05/2020 7:26 AM, notna wrote:
 Maybe you want to add an additional constraint... It would be great if 
 this would result in a tool, scripts or at least a simple-to-follow 
 to-do (say Wiki?!)... so best case we could use this also for the next 
 updates / releases in the future?!
The reason we can't just grab a newer copy of the unicode database and throw it into Phobos is because the format was changed.
Sure, nevertheless, I think it makes sense to keep the reproducibility of the process in mind. Maybe not with a script that lasts for 10 years, but with a documented process and some tools for a specific version, which can be used as inspiration for upcoming changes. IMO it makes a lot of sense for D to keep up closely with Unicode development.

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
May 05
parent rikki cattermole <rikki cattermole.co.nz> writes:
On 05/05/2020 9:05 PM, Robert M. Münch wrote:
 On 2020-05-04 21:34:27 +0000, rikki cattermole said:
 
 On 05/05/2020 7:26 AM, notna wrote:
 Maybe you want to add an additional constraint... It would be great 
 if this would result in a tool, scripts or at least a 
 simple-to-follow to-do (say Wiki?!)... so best case we could use this 
 also for the next updates / releases in the future?!
The reason we can't just grab a newer copy of the unicode database and throw it into Phobos is because the format was changed.
Sure, nevertheless, I think it makes sense to have the reproducibility of the process in mind. Maybe not with a script that lasts for 10 years. But the process, some tools for a specific version, which can be used as inspiration for upcoming changes.
Strange, I thought they were in the repo. Okay, after looking through his fork of Phobos to see if it's lying around somewhere, it looks like we need to hear from Dmitry Olshansky.
May 05
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Following my "Is it time for a Unicode update of std.uni?" post 
 in D group, I would like to try out to sponsor this effort for 
 "Issue 16416 - Phobos std.uni out of date (should be updated to 
 latest Unicode standard)" [1]

 For me, this, too, is an experiment to find out if it's 
 possible to move specific issues/topics forward. And maybe even 
 find people that are open to contact work too. For me, all 
 these things are pretty related.

 So, not knowing how much work it is, nor knowing what a good 
 amount would be, I took the other route and asked me, what is 

 not critical for me; most likely, I could even live with the 
 current state of std.uni. On the other hand, std.uni is a very 
 fundamental building block, and having it up to date and maybe 
 even extended should be much value to D.

 So, I'm offering $750 to get it done.
I guess I'm not eligible for the bounty ;)
 Besides getting the work done, there is one constraint: The 
 work needs to get into Phobos. It doesn't make sense to have it 
 sit around, because it's not being merged. I don't have any 
 clue who is in charge, who decides this. Or if there need to be 
 some conditions full-filled so that the result gets merged.
Anyhow if anyone wants easy money - shoot me an email, or reply in this thread. Spoiler is - the whole thing is code generated and there is only one table that I forgot about (i.e. I have no idea what is the source table for it in Unicode standard).
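
To illustrate what "code generated" means here, a minimal sketch follows. It is written in Python only as a neutral scripting language; the real generator is a D program and is not shown in this thread. The sketch turns a list of code point ranges into the text of a D array declaration. The function name `emit_d_table`, the table name `asciiLetters`, and the flat first/last layout are all assumptions made for the example, not what std.uni actually uses.

```python
def emit_d_table(name, ranges):
    """Render inclusive (first, last) code point ranges as D source text.

    Hypothetical layout: a flat array of first/last pairs, one pair per line.
    """
    lines = [f"immutable uint[] {name} = ["]
    for first, last in ranges:
        if first > last:
            raise ValueError(f"bad range: {first:#x}..{last:#x}")
        lines.append(f"    0x{first:04X}, 0x{last:04X},")
    lines.append("];")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # ASCII uppercase and lowercase letters, used purely as sample input.
    print(emit_d_table("asciiLetters", [(0x41, 0x5A), (0x61, 0x7A)]), end="")
```

The point of generating the tables rather than editing them by hand is that a new Unicode release only requires new input data and a re-run, as long as the input format stays the same.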
 [1] https://issues.dlang.org/show_bug.cgi?id=16416
P.S. I'm kind of back, but very busy and my health is mostly great despite the COVID outrage out there. --- Dmitry Olshansky
May 05
next sibling parent ag0aep6g <anonymous example.com> writes:
On 05.05.20 17:39, Dmitry Olshansky wrote:
 I'm guess I'm not eligible for the bounty ;)
Why wouldn't you be eligible? If it's an easy fix for you, that's just because you've got the needed expert knowledge. And that's valuable.
May 05
prev sibling next sibling parent reply Robert M. Münch <robert.muench saphirion.com> writes:
On 2020-05-05 15:39:12 +0000, Dmitry Olshansky said:

 On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Following my "Is it time for a Unicode update of std.uni?" post in D 
 group, I would like to try out to sponsor this effort for "Issue 16416 
 - Phobos std.uni out of date (should be updated to latest Unicode 
 standard)" [1]
 
 For me, this, too, is an experiment to find out if it's possible to 
 move specific issues/topics forward. And maybe even find people that 
 are open to contact work too. For me, all these things are pretty 
 related.
 
 So, not knowing how much work it is, nor knowing what a good amount 
 would be, I took the other route and asked me, what is it worth for me 

 likely, I could even live with the current state of std.uni. On the 
 other hand, std.uni is a very fundamental building block, and having it 
 up to date and maybe even extended should be much value to D.
 
 So, I'm offering $750 to get it done.
I'm guess I'm not eligible for the bounty ;)
Why not?
 Anyhow if anyone wants easy money - shoot me an email, or reply in this thread.
Well, as I wrote, since I don't have a really good understanding of the necessary effort, I started from "what is it worth to me in $ to get it done?". So, if it's a simple script change and a re-run, and you are the only one who knows this and you are keeping it to yourself... yes, it's easy money. On the other hand, if you can help someone get started and it's a couple of hours, I would expect people to be fair enough to state: Hey, $400 (or whatever) is OK, let's use the rest to sponsor something else. That's what I would do.
 Spoiler is - the whole thing is code generated and there is only one 
 table that I forgot about (i.e. I have no idea what is the source table 
 for it in Unicode standard).
By "forgot", do you mean that you can't remember, or that it's missing entirely from your prior work?
 P.S. I'm kind of back, but very busy and my health is mostly great 
 despite the COVID outrage out there.
That's great to hear... and maybe std.uni support/coaching for someone stepping up is possible. That would be great too. If so, maybe even I can try to do it...

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
May 05
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 5 May 2020 at 20:11:44 UTC, Robert M. Münch wrote:
 On 2020-05-05 15:39:12 +0000, Dmitry Olshansky said:

 On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 Following my "Is it time for a Unicode update of std.uni?" 
 post in D group, I would like to try out to sponsor this 
 effort for "Issue 16416 - Phobos std.uni out of date (should 
 be updated to latest Unicode standard)" [1]
 
 For me, this, too, is an experiment to find out if it's 
 possible to move specific issues/topics forward. And maybe 
 even find people that are open to contact work too. For me, 
 all these things are pretty related.
 
 So, not knowing how much work it is, nor knowing what a good 
 amount would be, I took the other route and asked me, what is 

 not critical for me; most likely, I could even live with the 
 current state of std.uni. On the other hand, std.uni is a 
 very fundamental building block, and having it up to date and 
 maybe even extended should be much value to D.
 
 So, I'm offering $750 to get it done.
I'm guess I'm not eligible for the bounty ;)
Why not?
Felt a bit like cheating. Russian traditions preclude taking money for things you (think you) wanted to do anyway.
 Anyhow if anyone wants easy money - shoot me an email, or 
 reply in this thread.
Well, as I wrote, since I don't have a real good understanding about the necessary effort I started from "what is it worth for me in $ to get it done?". So, if it's a simple script-change and a re-run and you are the only one knowing this and keeping it for yourself... yes, it's easy money. On the other hand, if you can help someone to get started and it's a couple of hours, I would expect people to be fair enough and state: Hey, $400 (or whatever) is OK, let's take the rest to sponsor something else. That's what I would do.
I started on it, and it turned out to be a bit more than I had hoped for. Plus, I'm doing it on a simple Windows workstation without many of my usual power tools. LDC for Windows works like a charm though.

It seems Unicode 13.0.0 pulled the plug on a couple of "derived" tables, that is, data files that can be reconstructed from other primary ones. It took at least half an hour to figure that out and rebuild the missing bits.

If you don't mind, I'll go with a $100 per hour estimate, which is basically my usual contract rate. It has taken me about 2 hours so far, and I think I'll be done in one or two more.

Merging this into Phobos, though, is the other 90% of the legwork. I hope somebody will help me with that, and maybe we'll just split your generous bounty this way.
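
As a rough illustration of rebuilding a "derived" table from a primary file, here is a minimal Python sketch (again, not the actual D generator, and the thread does not say which derived files were affected). It derives per-category code point ranges from lines in the `UnicodeData.txt` format, where field 0 is the code point in hex, field 1 the name, and field 2 the general category, and where large blocks are written as a `<..., First>` / `<..., Last>` pair. The function name `category_ranges` and the `SAMPLE` excerpt are made up for the example.

```python
from collections import defaultdict

# A few lines in UnicodeData.txt format, used as sample input only.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;
0043;LATIN CAPITAL LETTER C;Lu;0;L;;;;;N;;;;0063;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FFC;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
"""


def category_ranges(text):
    """Map each general category to a list of inclusive (first, last) ranges.

    Assumes the input is sorted by code point, as UnicodeData.txt is.
    """
    ranges = defaultdict(list)
    pending_first = None  # start of an open <..., First> / <..., Last> pair
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        cp, name, cat = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            pending_first = cp
            continue
        if name.endswith(", Last>"):
            if pending_first is None:
                raise ValueError(f"range end without start at {cp:04X}")
            first, pending_first = pending_first, None
        else:
            first = cp
        spans = ranges[cat]
        if spans and spans[-1][1] + 1 == first:
            # Adjacent to the previous span: extend it instead of adding one.
            spans[-1] = (spans[-1][0], cp)
        else:
            spans.append((first, cp))
    return dict(ranges)


if __name__ == "__main__":
    for cat, spans in sorted(category_ranges(SAMPLE).items()):
        print(cat, [f"{a:04X}..{b:04X}" for a, b in spans])
```

Because the result depends only on the primary file, a table like this can be regenerated whenever a separate derived file is no longer available.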
 Spoiler is - the whole thing is code generated and there is 
 only one table that I forgot about (i.e. I have no idea what 
 is the source table for it in Unicode standard).
With "forgot" you mean, you can't remember, or it's missing at all in your prior work?
I mean I know what this table does by its usage but the codegen part is missing, likely a classic missing commit problem of being a single maintainer of the codegen tool (and the fact that it's not in the main dlang repos).
 P.S. I'm kind of back, but very busy and my health is mostly 
 great despite the COVID outrage out there.
That's great to hear... and maybe std.uni support/coaching for someone stepping up is possible. That would be great too. If, maybe even I can try to do it...
Absolutely. I mean I'm in no shape to do the heavy lifting of day-in, day-out maintenance of std.* stuff, but I'd love to coach somebody to learn how std.regex and std.uni work. I can also share my vision for improvement, and will gladly help with refactoring. Both modules predate many of the good things in DLang, and std.allocator in particular. Boy, I'd love to have had allocators back in the day.

--
Dmitry Olshansky
May 05
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 5 May 2020 at 21:41:39 UTC, Dmitry Olshansky wrote:
 On Tuesday, 5 May 2020 at 20:11:44 UTC, Robert M. Münch wrote:
 On 2020-05-05 15:39:12 +0000, Dmitry Olshansky said:

 On the other hand, if you can help someone to get started and 
 it's a couple of hours, I would expect people to be fair 
 enough and state: Hey, $400 (or whatever) is OK, let's take 
 the rest to sponsor something else. That's what I would do.
So here goes, indeed about 4.5 hours so far.

The generator is untangled from the old crap from that GSOC 2012 repo:
https://github.com/DmitryOlshansky/gen-uni-dlang

PR:
https://github.com/dlang/phobos/pull/7469

Let's see if the CI loves it or not.
May 05
prev sibling parent Robert M. Münch <robert.muench saphirion.com> writes:
On 2020-05-05 21:41:39 +0000, Dmitry Olshansky said:

 Felt a bit like cheating. Russian traditions preclude taking money for things
 you (think you) wanted to do anyway.
Well, that's a good habit and still IMO it's OK to offer and take an incentive.
 I started on it, and it turned out a bit more then I hope for + I'm 
 doing it on simple Windows workstation without much of my usual power 
 tools. LDC for Windows works like a charm though.
 
 It seems Unicode 13.0.0 pulled a plug on a couple of "derived" tables, 
 that is data files that can be reconsturcted from other primary ones. 
 Took at least half an hour to figure that out and rebuild the missing 
 bits.
 
 If you don't mind I'll go with 100$ per hour estimate which is 
 basically my usual contract rate. It took me about 2 hours for now, and 
 I think I'd be done in a one or two more.
Great and deal.
 Merging this into Phobos though is the otehr 90% of the legwork, I hope 
 somebody will help me with that and maybe we'll just split your 
 generous bounty this way.
Sure. As I said, I'm not totally sure how this code-merging process works, who can do it, who approves things (if anyone), or whether it's enough that the automated tests don't fail.
 I mean I know what this table does by its usage but the codegen part is 
 missing, likely a classic missing commit problem of being a single 
 maintainer of the codegen tool (and the fact that it's not in the main 
 dlang repos).
Got it.
 Absolutely. I mean I'm in no shape to do the heavy lifting of day in 
 day out maintanance of std.* stuff but I'd love to coach somebody to 
 learn how std.regex and std.uni work. I can also share my vision for 
 improvement, and will gladly help with refactoring.
With the focus on std.uni, I think what would help is a short description of the whole process: a context-setting chapter ("Unicode pitfalls, important things to know, general process") and a step-by-step description/log of what needs to be done, what each step does, and where it fits into the overall picture.

Just found out that std.regex is from you too... nice.

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
May 06
prev sibling parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Tuesday, 5 May 2020 at 15:39:12 UTC, Dmitry Olshansky wrote:
 P.S. I'm kind of back, but very busy and my health is mostly 
 great despite the COVID outrage out there.
That's great! Glad to hear that. -- Bastiaan.
May 12
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 12 May 2020 at 07:21:43 UTC, Bastiaan Veelo wrote:
 On Tuesday, 5 May 2020 at 15:39:12 UTC, Dmitry Olshansky wrote:
 P.S. I'm kind of back, but very busy and my health is mostly 
 great despite the COVID outrage out there.
That's great! Glad to hear that.
Bastiaan! Great to see you still around. How is your D stuff going at that naval company?
 -- Bastiaan.
May 12
parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Tuesday, 12 May 2020 at 07:48:46 UTC, Dmitry Olshansky wrote:
 On Tuesday, 12 May 2020 at 07:21:43 UTC, Bastiaan Veelo wrote:
 On Tuesday, 5 May 2020 at 15:39:12 UTC, Dmitry Olshansky wrote:
 P.S. I'm kind of back, but very busy and my health is mostly 
 great despite the COVID outrage out there.
That's great! Glad to hear that.
Bastian! Great to see you still around. How your D stuff is going at that naval company?
First real application is running: a program for the numerical analysis of a ship launch at the yard. Currently testing and debugging. Pain points typically revolve around low level tricks in Pascal using arrays starting at 1 (these usually translate without problems, except where they don't)... Or passing strings to/from win32. Still committed to translate all other programs in our suite to D, busy times as usual. -- Bastiaan.
May 12
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 12 May 2020 at 08:11:03 UTC, Bastiaan Veelo wrote:
 On Tuesday, 12 May 2020 at 07:48:46 UTC, Dmitry Olshansky wrote:
 Bastian! Great to see you still around.

 How your D stuff is going at that naval company?
First real application is running: a program for the numerical analysis of a ship launch at the yard. Currently testing and debugging. Pain points typically revolve around low level tricks in Pascal using arrays starting at 1 (these usually translate without problems, except where they don't)... Or passing strings to/from win32. Still committed to translate all other programs in our suite to D, busy times as usual.
Cool stuff. Keep it rolling ;)
 -- Bastiaan.
May 12
prev sibling parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Monday, 4 May 2020 at 17:01:01 UTC, Robert M. Münch wrote:
 ...
I believe this is an excellent initiative, thank you for starting it!

Perhaps this script, along with the repository that it is part of, can help those wishing to update std.uni to the latest version:
https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.d

With regard to the rate of pull requests being merged into the core repositories, I would say that it is highly context-dependent. I strongly advise either:

a) subscribing to notifications from the core dlang repositories (dmd, druntime, phobos, dub, etc.) for an extended period of time (3 months min) - you'll be able to observe the group dynamics (e.g. which contributors have experience with which part of the codebase, why some things are merged quickly and others take a while, etc.) - this way you can really draw conclusions for yourself

b) looking at the statistics:
- https://github.com/dlang/dmd/pulse/monthly
- https://github.com/dlang/druntime/pulse/monthly
- https://github.com/dlang/phobos/pulse/monthly
- https://github.com/dlang/dub/pulse/monthly

as opposed to drawing conclusions from single data points of anecdotal evidence.

From my several years of experience, I can say the following:

- small, less complex pull requests are generally easy to get merged
- it depends on the part of the codebase - if you open a pull request for a part whose maintainers are currently active, you can expect a speedy review. If it's a part (e.g. std.regex) that is both highly complex and has a small number of maintainers, then it may take a while
- teamwork and communication - since all of us are living in different time zones, rather than working in the same office, you should be prepared for communication (which is a prerequisite of merging) to be high-latency. Changes that are described well, whose benefit is clear, and that don't look like they may introduce regressions are of course received well.

Discussion prior to opening a pull request can help to guide the implementation in the right direction and save time later in the review process. Many contributors are active on the dlang Slack [1], which makes it a good place to ping people for feedback, or just to have a near real-time conversation. In the past 1-3 years, I have noticed a trend that many active contributors are mostly active on GitHub and Slack, rather than the newsgroup.

If you see that a pull request has fallen through the cracks (no new replies from maintainers), don't hesitate to ping us either there or here on the newsgroup.

[1]: https://dlang.slack.com/
May 05