digitalmars.D - DCT use cases - draft

Roman D. Boiko (1/1) May 22 2012 http://d-coding.com/2012/05/22/dct-use-cases.html

Gor Gyolchanyan (5/6) May 22 2012 Oh, boy! IMO, that's gonna be THE most useful thing related to D so far.
Dmitry Olshansky (8/11) May 22 2012 Love this idea. Some sort of cool hierarchical pattern syntax would save...

Roman D. Boiko (3/12) May 22 2012 The complicated part is conceptual, efficiency is feasible to

Dmitry Olshansky (19/33) May 22 2012 Well for starters:

Roman D. Boiko (4/43) May 22 2012 This is a possible, although not the only, option. Anyway I'm not

Roman D. Boiko (4/11) May 22 2012 After thinking a bit more, I decided to investigate the topic

Dmitry Olshansky (14/25) May 22 2012 Yeah I still think you tried to dismiss it too early.

Roman D. Boiko (8/21) May 22 2012 And likely the only feasible in D. It doesn't support lazy

Dmitry Olshansky (6/27) May 22 2012 Prefer simple for prototype. Good old nested hash table is fine for

Roman D. Boiko (6/44) May 22 2012 Unfortunately, hash tables don't play nicely with immutability of

Denis Shelomovskij (20/21) May 22 2012 Please, please, try to force dmd to do what you want first because

Roman D. Boiko (2/22) May 22 2012 Thanks, but what you described is outside DCT scope and goals.

deadalnix (9/10) May 22 2012 Nice idea but I'll not be nice.
Jacob Carlborg (4/5) May 22 2012 I'm really not sure I understand what this is. Is this use cases

Roman D. Boiko (9/14) May 22 2012 This is a draft of use cases for all DCT libraries combined.

Jacob Carlborg (37/43) May 22 2012 That's a good point.

Roman D. Boiko (21/70) May 22 2012 Yeah, about 50% is lexing. I pay more attention to it because

Jacob Carlborg (6/8) May 22 2012 Wouldn't it be better to start with the use cases? You probably already

Roman D. Boiko (3/11) May 22 2012 Yes, that's the intention. I meant that *before Lexer* I must

deadalnix (3/81) May 22 2012 Providing it as a Range of Tokens would be the awesomest design decision...

Roman D. Boiko (2/20) May 22 2012 Indeed :)
Roman D. Boiko (2/20) May 22 2012 Agree

Roman D. Boiko (3/4) May 23 2012 Posted an updated version, but it is still a draft:

Jacob Carlborg (4/8) May 23 2012 That's a lot better :)
David Piepgrass (46/50) Jul 25 2012 I think one of the key challenges will be incremental updates.

Jacob Carlborg (5/11) Jul 26 2012 It would be nice if not even lexing or parsing needed to be done on the

David Piepgrass (16/20) Jul 25 2012 BTW, have you seen the video by Bret Victor entitled "Inventing

Jacob Carlborg (4/18) Jul 26 2012 That is so cool :)

"Roman D. Boiko" <rb d-coding.com> writes:

http://d-coding.com/2012/05/22/dct-use-cases.html

May 22 2012

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:

On Tue, May 22, 2012 at 6:04 PM, Roman D. Boiko <rb d-coding.com> wrote:

 http://d-coding.com/2012/05/22/dct-use-cases.html

Oh, boy! IMO, that's gonna be THE most useful thing related to D so far.

-- 
Bye,
Gor Gyolchanyan.

May 22 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 22.05.2012 18:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

Find parts of code model by a query (this single point will be 
expanded into its own series of posts)


Love this idea. Some sort of cool hierarchical pattern syntax would save 
megatons of boilerplate. Efficiency may suffer a bit but we have this 
CTFE stuff for hardcoded queries ;).

I can easily envision a code transformation of the whole project 
represented as a bunch of regex-like expressions.

-- 
Dmitry Olshansky

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 14:42:28 UTC, Dmitry Olshansky wrote:
 On 22.05.2012 18:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

 Find parts of code model by a query (this single point will be
 expanded into its own series of posts)

 Love this idea. Some sort of cool hierarchical pattern syntax 
 would save megatons of boilerplate. Efficiency may suffer a bit 
 but we have this CTFE stuff for hardcoded queries ;).

 I can easily envision a code transformation of the whole 
 project represented as a bunch of regex-like expressions.

The complicated part is conceptual, efficiency is feasible to 
achieve.

May 22 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 22.05.2012 18:45, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 14:42:28 UTC, Dmitry Olshansky wrote:
 On 22.05.2012 18:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

 Find parts of code model by a query (this single point will be
 expanded into its own series of posts)

 Love this idea. Some sort of cool hierarchical pattern syntax would
 save megatons of boilerplate. Efficiency may suffer a bit but we have
 this CTFE stuff for hardcoded queries ;).

 I can easily envision a code transformation of the whole project
 represented as a bunch of regex-like expressions.

 The complicated part is conceptual, efficiency is feasible to achieve.

Well for starters:

Abuse regex syntax, ban '/' w/o escape & use '/' for scopes/level, 
introduce a bunch of better wildcards/character sets like "identifier" 
etc. Throw in annotations that work like a list of tags the symbol 
should(n't) have: constant, template, function, class, struct, ... see 
traits. Same goes for protection & other annotations.

Then it goes like:

demo/{function, pure}f.*/i
->
j
(e.g. a simple rename; refactoring looks similar)

This fetches all local symbols named i inside pure functions that 
begin with 'f' inside entity demo (entity = struct, class, template, 
module, depending on where you apply it) and renames them to j.

I suggest going with this kind of improvised notation, with more examples, 
until you feel that the semantics are crystal clear.
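
To make the notation above concrete, here is a minimal sketch in Python 
of how such a query could be matched against a symbol tree. The Symbol 
class, parse_query and find are hypothetical names made up for this 
illustration; nothing here is DCT's API, and escaped '/' inside a name 
pattern is not handled.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Symbol:
    """Hypothetical code-model node: a name, a set of tags, nested symbols."""
    name: str
    tags: frozenset = frozenset()
    children: list = field(default_factory=list)


def parse_query(query):
    """Split 'a/{t1, t2}regex/b' into (required_tags, name_regex) steps."""
    steps = []
    for part in query.split("/"):  # assumption: no escaped '/' in patterns
        m = re.fullmatch(r"(?:\{([^}]*)\})?(.*)", part)
        raw_tags = (m.group(1) or "").split(",")
        tags = frozenset(t.strip() for t in raw_tags if t.strip())
        steps.append((tags, re.compile(m.group(2))))
    return steps


def find(root, query):
    """Yield symbols whose path from root matches every step of the query."""
    def walk(sym, steps):
        tags, name_re = steps[0]
        if not (tags <= sym.tags and name_re.fullmatch(sym.name)):
            return
        if len(steps) == 1:
            yield sym
        else:
            for child in sym.children:
                yield from walk(child, steps[1:])
    yield from walk(root, parse_query(query))


demo = Symbol("demo", frozenset({"module"}), [
    Symbol("foo", frozenset({"function", "pure"}), [Symbol("i"), Symbol("k")]),
    Symbol("fimpure", frozenset({"function"}), [Symbol("i")]),
    Symbol("bar", frozenset({"function", "pure"}), [Symbol("i")]),
])

hits = list(find(demo, "demo/{function, pure}f.*/i"))
for sym in hits:
    sym.name = "j"  # the "-> j" part: a plain rename of every match
```

Only the i inside foo is renamed: fimpure lacks the pure tag and bar 
does not start with 'f'.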


-- 
Dmitry Olshansky

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 14:56:42 UTC, Dmitry Olshansky wrote:
 On 22.05.2012 18:45, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 14:42:28 UTC, Dmitry Olshansky 
 wrote:
 On 22.05.2012 18:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

 Find parts of code model by a query (this single point will be
 expanded into its own series of posts)

 Love this idea. Some sort of cool hierarchical pattern syntax 
 would
 save megatons of boilerplate. Efficiency may suffer a bit but 
 we have
 this CTFE stuff for hardcoded queries ;).

 I can easily envision a code transformation of the whole 
 project
 represented as a bunch of regex-like expressions.

 The complicated part is conceptual, efficiency is feasible to 
 achieve.

 Well for starters:

 Abuse regex syntax, ban '/' w/o escape & use '/' for 
 scopes/level, introduce a bunch of better wildcards/character 
 sets like "identifier" etc. Throw in annotations that work like 
 a list of tags the symbol should(n't) have: constant, template, 
 function, class, struct, ... see traits. Same goes for 
 protection & other annotations.

 Then it goes like:

 demo/{function, pure}f.*/i
 ->
 j
 (e.g. simple rename, refactoring looks similarly)

 To fetch all of local symbols named i inside of pure functions 
 that begin with 'f' inside entity demo  (entity = struct, 
 class, template, module depending on where you apply it) and 
 rename them to j.

 I suggest to go with such kind of improvised notation with more 
 examples until you feel that semantics are crystal clear.

This is a possible, although not the only, option. Anyway I'm not
ready for designing the details yet. There is a lot to do before
that.

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 15:15:49 UTC, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 14:56:42 UTC, Dmitry Olshansky wrote:
 I suggest to go with such kind of improvised notation with 
 more examples until you feel that semantics are crystal clear.

 This is a possible, although not the only, option. Anyway I'm 
 not
 ready for designing the details yet. There is a lot to do before
 that.

After thinking a bit more, I decided to investigate the topic
early, although not immediately :) It should help in deciding which
indices are needed, where the rough edges of the API are, etc.

May 22 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 22.05.2012 19:36, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 15:15:49 UTC, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 14:56:42 UTC, Dmitry Olshansky wrote:
 I suggest to go with such kind of improvised notation with more
 examples until you feel that semantics are crystal clear.

 This is a possible, although not the only, option. Anyway I'm not
 ready for designing the details yet. There is a lot to do before
 that.

 After thinking a bit more, I decided to investigate the topic
 early, although not immediately :) It should help deciding which
 indices are needed, where are rough edges of API, etc.

Yeah I still think you tried to dismiss it too early.

In my mind it's encountered rather soon:

lex/scan -> postprocess (may be combined with lex) -> populate symbol 
tables (ditto  - with next step/previous step) -> parse to AST -> ...

That symbol table should contain the whole rich hierarchy of modules. 
That is, the actual compiler can get by with a single stack of scopes 
that it pushes/pops as it processes code. Your DCT, on the other hand, 
should have all local scopes (of every entity) at once.

It may be possible to simulate it with some degree of laziness, but I 
guess keeping the whole symbol table is still the easiest and fastest 
way. Watch out for some mind-bending data structures :)
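
The difference between a compiler's scope stack and a symbol table that 
keeps every scope can be sketched as follows. This is a Python 
illustration with a hypothetical Scope class, not a description of how 
dmd or DCT stores scopes: each scope keeps a link to its parent, so 
lookup still walks outward like a stack would, but no scope is discarded 
after it has been processed.

```python
class Scope:
    """A lexical scope that stays alive after analysis, linked to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}   # identifier -> kind of symbol
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def declare(self, ident, kind):
        self.symbols[ident] = kind

    def lookup(self, ident):
        """Resolve an identifier by walking outward, as a scope stack would."""
        scope = self
        while scope is not None:
            if ident in scope.symbols:
                return scope, scope.symbols[ident]
            scope = scope.parent
        return None

    def all_scopes(self):
        """Visit this scope and every nested scope; a stack cannot do this."""
        yield self
        for child in self.children:
            yield from child.all_scopes()


module = Scope("demo")
module.declare("limit", "constant")
f = Scope("f", module)
f.declare("i", "variable")
g = Scope("g", module)
g.declare("i", "variable")

owner, kind = f.lookup("limit")  # found in the enclosing module scope
declaring_i = [s.name for s in module.all_scopes() if "i" in s.symbols]
```

The last line is the kind of whole-program question (which scopes 
declare i?) that needs all local scopes to be available at once.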

-- 
Dmitry Olshansky

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 16:03:49 UTC, Dmitry Olshansky wrote:
 In my mind it's encountered rather soon:

 lex/scan -> postprocess (may be combined with lex) -> populate 
 symbol tables (ditto  - with next step/previous step) -> parse 
 to AST -> ...
 That symbol table should contain all of the rich hierarchy of 
 modules. That is the actual compiler is able to have a single 
 stack of scopes that it pushes/pops as it processes code. Your 
 DCT on the other hand should have all of local scopes (of every 
 entity) at once.

 It may be possible to simulate it with some deal of laziness, 
 but I guess keeping the whole symbol table is the easiest and 
 the fastest way still.

And likely the only feasible one in D, which doesn't support lazy 
evaluation for immutable data structures; immutability is 
necessary for most use cases.

 Watch out for some mind-bending data structures :)

Do you mean not to overcomplicate? Or use classic data 
structures? Or something else?

So far I think immutable red-black trees will be central in DCT 
architecture, as well as some others.

May 22 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 22.05.2012 20:47, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 16:03:49 UTC, Dmitry Olshansky wrote:
 In my mind it's encountered rather soon:

 lex/scan -> postprocess (may be combined with lex) -> populate symbol
 tables (ditto - with next step/previous step) -> parse to AST -> ...
 That symbol table should contain all of the rich hierarchy of modules.
 That is the actual compiler is able to have a single stack of scopes
 that it pushes/pops as it processes code. Your DCT on the other hand
 should have all of local scopes (of every entity) at once.

 It may be possible to simulate it with some deal of laziness, but I
 guess keeping the whole symbol table is the easiest and the fastest
 way still.

 And likely the only feasible in D. It doesn't support lazy evaluation
 for immutable data structures, and immutability is necessary for most
 use cases.

 Watch out for some mind-bending data structures :)

 Do you mean not to overcomplicate? Or use classic data structures? Or
 something else?

Prefer something simple for the prototype. A good old nested hash 
table is fine for starters.

 So far I think immutable red-black trees will be central in DCT
 architecture, as well as some others.

Cool ;) Though why not Tries then if the data is immutable?

-- 
Dmitry Olshansky

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 16:55:46 UTC, Dmitry Olshansky wrote:
 On 22.05.2012 20:47, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 16:03:49 UTC, Dmitry Olshansky 
 wrote:
 In my mind it's encountered rather soon:

 lex/scan -> postprocess (may be combined with lex) -> 
 populate symbol
 tables (ditto - with next step/previous step) -> parse to AST 
 -> ...
 That symbol table should contain all of the rich hierarchy of 
 modules.
 That is the actual compiler is able to have a single stack of 
 scopes
 that it pushes/pops as it processes code. Your DCT on the 
 other hand
 should have all of local scopes (of every entity) at once.

 It may be possible to simulate it with some deal of laziness, 
 but I
 guess keeping the whole symbol table is the easiest and the 
 fastest
 way still.

 And likely the only feasible in D. It doesn't support lazy 
 evaluation
 for immutable data structures, and immutability is necessary 
 for most
 use cases.

 Watch out for some mind-bending data structures :)

 Do you mean not to overcomplicate? Or use classic data 
 structures? Or
 something else?

 Prefer simple for prototype. Good old nested hash table is fine 
 for starters.

Unfortunately, hash tables don't play nicely with immutable 
data: although the interface could be restricted to prevent 
mutation, an updated table could not share memory with the 
previous version, and thus every update would require copying.
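
The memory-sharing argument can be illustrated with a small persistent 
search tree. This Python sketch uses a plain unbalanced binary search 
tree with path copying, which is simpler than the red-black trees 
mentioned in this thread; the names Node, insert and lookup are 
hypothetical. An update copies only the nodes on the path to the change 
and shares every other subtree with the old version.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; NamedTuple fields cannot be reassigned."""
    key: str
    value: object
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key, value):
    """Return a new tree containing key; the input tree is left untouched."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


def lookup(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


v1 = None
for name in ["m", "c", "t"]:
    v1 = insert(v1, name, "v1")

v2 = insert(v1, "a", "v2")  # a new version; v1 remains valid and unchanged
```

Here v2.right is the very same object as v1.right, so the untouched 
half of the tree is not copied. A hash table backed by one flat array 
has no such substructure to share.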

 So far I think immutable red-black trees will be central in DCT
 architecture, as well as some others.

 Cool ;) Though why not Tries then if the data is immutable?

Those too :) For different purposes than the former.

May 22 2012

Denis Shelomovskij <verylonglogin.reg gmail.com> writes:

On 22.05.2012 18:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

Please, please, try to force dmd to do what you want first, because 
otherwise you (like every other existing parser in an IDE) will fail on 
templates, which are used everywhere in D (I mean std.algorithm).

A suggestion:
   Step 1 (bad performance): pass the whole UTF-8 encoded source to dmd for 
recompilation every time (through a memory-mapped file, e.g.) and force 
dmd to write out everything you want (yes, to an mmfile, because there will 
be a lot of information).
   Step 2 (better performance): stop dmd at some compilation stage (in a 
function context) and, on user input, fork dmd, give it the new data, 
execute it, kill it. Pass the whole source only when, e.g., the user starts 
editing another function.

It looks like we can't do anything better without a good compiler-as-library.


By the way, it was really easy to change dmd to produce token 
information for token colorizing, and it worked faster than Eclipse's 
Descent, IIRC.

-- 
Денис В. Шеломовский
Denis V. Shelomovskij

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 14:48:49 UTC, Denis Shelomovskij wrote:
 On 22.05.2012 18:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

 Please, please, try to force dmd to do what you want first 
 because otherwise you (like every other existing parsers in 
 IDE) will fail with templates which are used everywhere in D (I 
 mean std.algorithm).

 A suggestion:
   Step 1 (bad performance): pass whole UTF-8 encoded source to 
 dmd for recompilation every time (through memory mapped file, 
 e.g.) and force dmd to write everything you want (yes, to 
 mmfile because there will be a lot of information).
   Step 2 (better performance): stop dmd at some compilation 
 stage (in a function context) and on user input fork dmd, give 
 it new data, execute it, kill it. Pass whole source only when 
 e.g. user start editing of another function.

 It looks like we can't do anything better without good 
 compiler-as-library.


 By the way, it was really easy to change dmd to produce token 
 information for token colorizing and it worked faster than 
 Eclipse's Descent IIRC.

Thanks, but what you described is outside DCT scope and goals.

May 22 2012

deadalnix <deadalnix gmail.com> writes:

On 22/05/2012 16:04, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

Nice idea, but I won't be nice.

What is needed here is design. The needs are not really new, nor 
specific to D.

So far, you have ignored all existing projects on the subject, and shown 
that you tend to overcomplicate the design when it isn't required. I don't 
see here the right attitude to come up with a tool that will:
1/ Not repeat errors made previously.
2/ Get reused.

May 22 2012

"Jacob Carlborg" <doob me.com> writes:

On Tuesday, 22 May 2012 at 14:04:18 UTC, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

I'm really not sure I understand what this is. Are these use cases 
for a lexer or for some other, higher-level tool? Because I don't 
see what projects and workspaces have to do with a lexer.

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 17:06:43 UTC, Jacob Carlborg wrote:
 On Tuesday, 22 May 2012 at 14:04:18 UTC, Roman D. Boiko wrote:
 http://d-coding.com/2012/05/22/dct-use-cases.html

 I'm really not sure I understand what this is. Is this use 
 cases for a lexer or some other more high level tool? Because I 
 don't see what projects and workspaces have to do with a lexer.

This is a draft of use cases for all DCT libraries combined.

The scope of DCT is semantic analysis, but not code 
generation (that may become another project some time). 
Information about projects, etc., is useful for, e.g., analysing 
dependencies.

I'll improve overall structure and add some explanations + 
examples tomorrow. Could you elaborate on specific points which 
are vague?

May 22 2012

Jacob Carlborg <doob me.com> writes:

On 2012-05-22 19:14, Roman D. Boiko wrote:

 This is a draft of use cases for all DCT libraries combined.

This seems to be mostly focused on lexing? See below for some ideas.

 Scope for DCT is to provide semantic analysis, but not code generation
 (that may become another project some time). Information about projects,
 etc., is useful for e.g., analysing dependencies.

That's a good point.

 I'll improve overall structure and add some explanations + examples
 tomorrow. Could you elaborate on specific points which are vague?

I would probably have specified some high level use cases first, like:

* IDE integration
* Refactoring tool
* Static analysis
* Compiler
* Doc generating
* Build tool

In general, these are use cases that can span several compile phases, 
i.e. lexing, parsing, semantic analysis and so on. Some of them can be 
broken into several new use cases at a lower level. Some examples:

IDE integration:

* Syntax highlighting
* Code completion
* Showing lex, syntax and semantic errors

Refactoring:

* Cross-referencing symbols

Build tool:

* Tracking module dependencies

Doc generating:

* Associate a declaration and its documentation

Some of these "sub" use cases are needed by several tools; you can 
either repeat them or pick unique sub use cases for each high level use 
case.

Then you can get into more detail over lower level use cases for the 
different compile phases. If you have enough to write you could probably 
have a post about the use cases for each phase.

It seems some of your use cases are implementation details or design 
goals, like "Store text efficiently".

It would not be necessary to start with the high level goals, but it 
would be nice. The next best thing would probably be to start with the 
use cases for the compiler phase you have already started on, that is 
lexing, if I have understood everything correctly.

-- 
/Jacob Carlborg

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 18:10:59 UTC, Jacob Carlborg wrote:
 On 2012-05-22 19:14, Roman D. Boiko wrote:

 This is a draft of use cases for all DCT libraries combined.

 This seems to be mostly focused on lexing? See below for some 
 ideas.

Yeah, about 50% is lexing. I pay more attention to it because a 
lexer alone is enough for several uses. I would like to have at 
least some functionality used as early as possible; this would 
give me great feedback.

 Scope for DCT is to provide semantic analysis, but not code 
 generation (that may become another project some time). 
 Information about projects, etc., is useful for e.g., 
 analysing dependencies.

 That's a good point.

 I'll improve overall structure and add some explanations + 
 examples
 tomorrow. Could you elaborate on specific points which are 
 vague?

 I would probably have specified some high level use cases 
 first, like:

 * IDE integration
 * Refactoring tool
 * Static analysis
 * Compiler
 * Doc generating
 * Build tool

Thanks! I didn't think about a build tool, for example.

I started this way, but after your comment on my previous post 
that there is nothing new I reconsidered my approach and decided 
to start from the concrete (low-level), then improve it according to 
feedback, and then split it into areas (which roughly correspond to 
your high-level use cases).

 In general, use cases that can span several compile phases, 
 i.e. lexing, parsing, semantic analysis and so on. Some of 
 these use cases can be broken in to several new use cases at a 
 lower level. Some examples:

 IDE integration:

 * Syntax highlighting
 * Code completion
 * Showing lex, syntax and semantic errors

 Refactoring:

 * Cross-referencing symbols

 Build tool:

 * Tracking module dependencies

 Doc generating:

 * Associate a declaration and its documentation

 Some of these "sub" use cases are needed by several tools, then 
 you can either repeat them or pick unique sub use cases for 
 each high level use case.

 Then you can get into more detail over lower level use cases 
 for the different compile phases. If you have enough to write 
 you could probably have a post about the use cases for each 
 phase.

Thanks for examples.

 It seems some of your use cases are implementation details or 
 design goals, like "Store text efficiently".

Actually, many of those are architectural (although low-level), 
because they are key to achieving the project goals, and failing in 
this area could cause overall failure. I intend to move any 
non-architectural information into a separate series of posts; 
feel free to comment on what you don't consider important for the 
architecture (probably don't start yet, I'm reviewing the text right 
now).

 It would not be necessary to start with the high level goals, 
 but it would be nice. The next best thing would probably be to 
 start with the use cases compiler phase you already have 
 started on, that is lexing, if I have understood everything 
 correctly.

Yes, and even before that I'm going to document some fundamental 
primitives, like immutability and core data structures.

May 22 2012

Jacob Carlborg <doob me.com> writes:

On 2012-05-22 20:33, Roman D. Boiko wrote:

 Yes, and even before that I'm going to document some fundamental
 primitives, like immutability and core data structures.

Wouldn't it be better to start with the use cases? You probably already 
have a fairly good idea about the use cases, but in theory the use cases 
could change what the data structures look like.

-- 
/Jacob Carlborg

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 18:59:48 UTC, Jacob Carlborg wrote:
 On 2012-05-22 20:33, Roman D. Boiko wrote:

 Yes, and even before that I'm going to document some 
 fundamental
 primitives, like immutability and core data structures.

 Wouldn't it be better to start with the use cases? You probably 
 already have a fairly good idea about the use cases, but in 
 theory the use cases could change how the data structure might 
 look like.

Yes, that's the intention. I meant that *before Lexer* I must 
deal with some critical (fundamental) primitives.

May 22 2012

deadalnix <deadalnix gmail.com> writes:

On 22/05/2012 20:33, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 18:10:59 UTC, Jacob Carlborg wrote:
 On 2012-05-22 19:14, Roman D. Boiko wrote:

 This is a draft of use cases for all DCT libraries combined.

 This seems to be mostly focused on lexing? See below for some ideas.

 Yeah, about 50% is lexing. I pay more attention to it because lexer
 alone is enough for several uses. I would like to have at least some
 functionality used as early as possible, this would provide me great
 feedback.

Providing it as a Range of Tokens would be the awesomest design decision 
you could ever make.
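
As a rough illustration of what a lazily produced token stream buys, 
here is a sketch in Python, where a generator plays the role that an 
input range would play in D. The Token type, the token kinds and the toy 
grammar are assumptions for this example only (keywords are not even 
distinguished from identifiers); it is not DCT's lexer.

```python
import re
from itertools import islice
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    offset: int


# Toy grammar: alternatives are tried in order, so comments win over '/'.
TOKEN_RE = re.compile(r"""
    (?P<whitespace>\s+)
  | (?P<comment>//[^\n]*)
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<punct>[{}()\[\];,.=+\-*/<>!&|])
""", re.VERBOSE)


def lex(source: str) -> Iterator[Token]:
    """Yield tokens one at a time; nothing is lexed until it is requested."""
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            # Unknown character: emit an error token and keep going.
            yield Token("error", source[pos], pos)
            pos += 1
            continue
        yield Token(m.lastgroup, m.group(), pos)
        pos = m.end()


source = "int x = 42; // answer"
first_two = list(islice(lex(source), 2))  # lexes only as far as needed
idents = [t.text for t in lex(source) if t.kind == "identifier"]
```

Because the consumer pulls tokens on demand, the same lexer serves a 
syntax highlighter that only needs the visible part of a file and a 
parser that consumes everything, and it composes with generic filters.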

 Scope for DCT is to provide semantic analysis, but not code
 generation (that may become another project some time). Information
 about projects, etc., is useful for e.g., analysing dependencies.

 That's a good point.

 I'll improve overall structure and add some explanations + examples
 tomorrow. Could you elaborate on specific points which are vague?

 I would probably have specified some high level use cases first, like:

 * IDE integration
 * Refactoring tool
 * Static analysis
 * Compiler
 * Doc generating
 * Build tool

 Thanks! I didn't think about build tool, for example.

 I started this way, but after your comment on my previous post that
 there is nothing new I reconsidered my approach and decided to start
 from concrete (low-level), then improve it according to feedback, and
 then split into areas (which roughly correspond to your hi-level use
 cases).

 In general, use cases that can span several compile phases, i.e.
 lexing, parsing, semantic analysis and so on. Some of these use cases
 can be broken in to several new use cases at a lower level. Some
 examples:

 IDE integration:

 * Syntax highlighting
 * Code completion
 * Showing lex, syntax and semantic errors

 Refactoring:

 * Cross-referencing symbols

 Build tool:

 * Tracking module dependencies

 Doc generating:

 * Associate a declaration and its documentation

 Some of these "sub" use cases are needed by several tools, then you
 can either repeat them or pick unique sub use cases for each high
 level use case.

 Then you can get into more detail over lower level use cases for the
 different compile phases. If you have enough to write you could
 probably have a post about the use cases for each phase.

 Thanks for examples.

 It seems some of your use cases are implementation details or design
 goals, like "Store text efficiently".

 Actually, many of those are architectural (although low-level), because
 they are key to achieve the project goals, and failing in this area
 could cause overall failure. I intend to move any non-architectural
 information into a separate series of posts, feel free commenting what
 you don't consider important for the architecture (probably don't start
 yet, I'm reviewing text right now).

 It would not be necessary to start with the high level goals, but it
 would be nice. The next best thing would probably be to start with the
 use cases compiler phase you already have started on, that is lexing,
 if I have understood everything correctly.

 Yes, and even before that I'm going to document some fundamental
 primitives, like immutability and core data structures.

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 20:31:40 UTC, deadalnix wrote:
 On 22/05/2012 20:33, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 18:10:59 UTC, Jacob Carlborg wrote:
 On 2012-05-22 19:14, Roman D. Boiko wrote:

 This is a draft of use cases for all DCT libraries combined.

 This seems to be mostly focused on lexing? See below for some 
 ideas.

 Yeah, about 50% is lexing. I pay more attention to it because 
 lexer
 alone is enough for several uses. I would like to have at 
 least some
 functionality used as early as possible, this would provide me 
 great
 feedback.

 Providing it as a Range of Tokens would be the awesomest design 
 decision you could ever make.

Indeed :)

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 20:31:40 UTC, deadalnix wrote:
 On 22/05/2012 20:33, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 18:10:59 UTC, Jacob Carlborg wrote:
 On 2012-05-22 19:14, Roman D. Boiko wrote:

 This is a draft of use cases for all DCT libraries combined.

 This seems to be mostly focused on lexing? See below for some 
 ideas.

 Yeah, about 50% is lexing. I pay more attention to it because 
 lexer
 alone is enough for several uses. I would like to have at 
 least some
 functionality used as early as possible, this would provide me 
 great
 feedback.

 Providing it as a Range of Tokens would be the awesomest design 
 decision you could ever make.

Agree

May 22 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Tuesday, 22 May 2012 at 18:33:38 UTC, Roman D. Boiko wrote:
 I'm reviewing text right now

Posted an updated version, but it is still a draft:

http://d-coding.com/2012/05/23/dct-use-cases-revised.html

May 23 2012

Jacob Carlborg <doob me.com> writes:

On 2012-05-23 17:36, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 18:33:38 UTC, Roman D. Boiko wrote:
 I'm reviewing text right now

 Posted an updated version, but it is still a draft:

 http://d-coding.com/2012/05/23/dct-use-cases-revised.html

That's a lot better :)

-- 
/Jacob Carlborg

May 23 2012

"David Piepgrass" <qwertie256 gmail.com> writes:

On Wednesday, 23 May 2012 at 15:36:59 UTC, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 18:33:38 UTC, Roman D. Boiko wrote:
 I'm reviewing text right now

 Posted an updated version, but it is still a draft:

 http://d-coding.com/2012/05/23/dct-use-cases-revised.html

I think one of the key challenges will be incremental updates. 
You could perhaps afford to reparse entire source files on each 
keystroke, assuming DCT runs on a PC*, but you don't want to 
repeat the whole semantic analysis of several modules on every 
keystroke. (*although, in all seriousness, I hope someday to 
browse/write code in a smartphone/tablet IDE, without killing 
battery life)

D in particular makes standard IDE features difficult, if the 
code uses a lot of CTFE just to decide the meaning of the code, 
e.g. "static if" computes 1_000_000 digits of PI and decides 
whether to declare method "foo" or method "bar" based on whether 
the last digit is odd or even.

Of course, code does not normally waste the compiler's time 
deliberately, but these sorts of things can easily crop up 
accidentally. So DCT could profile its own operation and report 
to the user which analyses and functions are taking the longest 
to run.

Ideally, somebody would design an algorithm that, given a 
location where the syntax tree has changed, figures out what 
parts of the code are impacted by that change and only re-runs 
semantic analysis on the code whose meaning has potentially 
changed.

But maybe that is just too hard. A simple approach would be to 
just re-analyze the whole damn program, but prioritize analysis 
so that whatever code the user is looking at is re-analyzed 
first. This could be enhanced by a simple-minded dependency tree, 
so that changing module X does not trigger reinterpretation of 
module Y if Y does not directly or indirectly use X at all.
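
The simple-minded dependency tree could look roughly like the sketch below. It assumes the tool already has a map from each module to the modules it imports directly; the function name affectedBy and the module names are invented for illustration.

```d
import std.stdio;

// imports[m] lists the modules that m imports directly (assumed input).
// Returns the set of modules to re-analyze after `changed` is edited:
// the module itself plus everything that imports it directly or
// indirectly.
bool[string] affectedBy(string changed, string[][string] imports)
{
    // Invert the graph once: importedBy[x] lists direct importers of x.
    string[][string] importedBy;
    foreach (mod, deps; imports)
        foreach (dep; deps)
            importedBy[dep] ~= mod;

    // Walk the inverted graph from the changed module.
    bool[string] dirty = [changed: true];
    string[] work = [changed];
    while (work.length)
    {
        auto current = work[$ - 1];
        work = work[0 .. $ - 1];
        if (auto users = current in importedBy)
            foreach (user; *users)
                if (user !in dirty)
                {
                    dirty[user] = true;
                    work ~= user;
                }
    }
    return dirty;
}

void main()
{
    // app imports parser and ui; parser imports lexer.
    string[][string] imports = [
        "app": ["parser", "ui"],
        "parser": ["lexer"]
    ];
    auto dirty = affectedBy("lexer", imports);
    // ui never reaches lexer, so it is left alone.
    writeln("ui needs re-analysis: ", ("ui" in dirty) !is null);
}
```

This works at module granularity only, so it over-approximates: a change to a private function body in X still marks every importer of X as dirty.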

By using multiple threads to analyze, any long computations 
wouldn't prevent analysis of the "easy parts"; but several 
threads could get stuck waiting on the same thing. For example, 
it would seem to me that if a module X contains a slow "static 
if" at module scope, ANY other module that imports X cannot 
resolve ANY unqualified function calls until that "static if" is 
done processing, because the contents of the "static if" MIGHT 
create new overloads that have to be considered*. So, when a 
thread gets stuck, it needs to be able to look for other work to 
do instead.

In any case, since D is Turing-complete and CTFE may enter 
infinite loops (or just very long loops), an IDE will need to 
occasionally terminate threads and restart analysis, so the 
analysis threads must be killable, but hopefully it could be 
designed so that analysis doesn't have to restart from scratch.
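
One way to get both properties without forcibly killing threads is cooperative time-slicing: each analysis task runs for a bounded number of steps and keeps its state, so it can be dropped or resumed later. This is a swapped-in technique, not something proposed in the thread, and the sketch below uses a Collatz iteration as a hypothetical stand-in for a long-running CTFE evaluation.

```d
import std.stdio;

// Hypothetical resumable task. The loop body stands in for one
// interpreter step of a compile-time evaluation.
struct CollatzTask
{
    ulong n;      // current value
    ulong steps;  // steps executed so far, kept across calls

    // Runs at most maxSteps iterations, then yields.
    // Returns true once the computation has finished.
    bool run(size_t maxSteps)
    {
        while (n != 1 && maxSteps > 0)
        {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            ++steps;
            --maxSteps;
        }
        return n == 1;
    }
}

void main()
{
    auto task = CollatzTask(27);

    // First slice: the budget runs out, but progress is retained.
    bool finished = task.run(50);
    writeln("finished after first slice: ", finished);

    // The scheduler is now free to do other work, abandon the task,
    // or continue it later from where it stopped.
    finished = task.run(1000);
    writeln("finished after second slice: ", finished,
            ", total steps: ", task.steps);
}
```

A scheduler built on this idea never needs to terminate a thread mid-computation; a task that exceeds its budget is simply not rescheduled.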

I guess immutable data structures will therefore be quite 
important in the design, which you seem to be aware of already.

Jul 25 2012

Jacob Carlborg <doob me.com> writes:

On 2012-07-26 01:46, David Piepgrass wrote:

 I think one of the key challenges will be incremental updates. You could
 perhaps afford to reparse entire source files on each keystroke,
 assuming DCT runs on a PC*, but you don't want to repeat the whole
 semantic analysis of several modules on every keystroke. (*although, in
 all seriousness, I hope someday to browse/write code in a
 smartphone/tablet IDE, without killing battery life)

It would be nice if not even lexing or parsing needed to be done on the 
whole file.

-- 
/Jacob Carlborg

Jul 26 2012

"David Piepgrass" <qwertie256 gmail.com> writes:

On Wednesday, 23 May 2012 at 15:36:59 UTC, Roman D. Boiko wrote:
 On Tuesday, 22 May 2012 at 18:33:38 UTC, Roman D. Boiko wrote:
 I'm reviewing text right now

 Posted an updated version, but it is still a draft:

 http://d-coding.com/2012/05/23/dct-use-cases-revised.html

BTW, have you seen the video by Bret Victor entitled "Inventing 
on Principle"? This should be a use case for DCT:

http://vimeo.com/36579366

The most important part for the average (nongraphical) developer 
is his demo of writing a binary search algorithm. It may be 
difficult to use an ordinary debugger to debug CTFE, template 
overload resolution and "static if" statements, but something 
like Bret's demo, or what the Light Table IDE is supposed to do...

http://www.kickstarter.com/projects/ibdknox/light-table

...would be perfect for compile-time debugging, and not only 
that, it would also help people write their code in the first 
place, including (obviously) code intended for run-time.

P.S. oh how nice it would be if we could convince anyone to pay 
us to develop these compiler tools... just minimum wage would be 
soooo nice.

Jul 25 2012

Jacob Carlborg <doob me.com> writes:

On 2012-07-26 02:03, David Piepgrass wrote:

 BTW, have you seen the video by Bret Victor entitled "Inventing on
 Principle"? This should be a use case for DCT:

 http://vimeo.com/36579366

 The most important part for the average (nongraphical) developer is his
 demo of writing a binary search algorithm. It may be difficult to use an
 ordinary debugger to debug CTFE, template overload resolution and
 "static if" statements, but something like Bret's demo, or what the
 Light Table IDE is supposed to do...

 http://www.kickstarter.com/projects/ibdknox/light-table

 ...would be perfect for compile-time debugging, and not only that, it
 would also help people write their code in the first place, including
 (obviously) code intended for run-time.

 P.S. oh how nice it would be if we could convince anyone to pay us to
 develop these compiler tools... just minimum wage would be soooo nice.

That is so cool :)

-- 
/Jacob Carlborg

Jul 26 2012
