Commit d6ccffc0 authored by Leandro Melo

C++: Core changes in preprocessing

Summary of most relevant items:

- Preprocessor output format change. No more gen true/false. Instead
  a more intuitive and natural expansion (like from a real compiler) is
  performed directly corresponding to the macro invocation. Notice that
  information about the generated tokens is not lost, because it's now
  embedded in the expansion section header (in terms of lines and columns
  as explained in the code). In addition, the location where the macro
  expansion happens is also documented for future use.

- Fix line control directives and associated token line numbers.
  This was not detected in test cases because some of them were
  actually wrong: within expansions, the line information was taken
  from the macro definition, whereas Creator's reporting mechanism
  (just like regular compilers) expects the line from the expanded
  version of the tokens.

- Do not allow for eager expansion. This was previously being done
  inside define directives. However, it's not allowed and might
  lead to incorrect results, since the argument substitution should
  only happen upon the macro invocation (and following nested ones).
  At least GCC and clang are consistent with that. See test case
  tst_Preprocessor::dont_eagerly_expand for a detailed explanation.

- Revive the 'expanded' token flag. This is used to mark every token
  that originates from a macro expansion. Notice, however, that
  expanded tokens are not necessarily generated tokens (although
  every generated token is an expanded token). Expanded tokens that
  are not generated are those which are still considered by our
  code model features, since they are visible in the editor. The
  translation unit is smart enough to calculate line/column position
  for such tokens based on the information from the expansion section
  header.

- How expansions are tracked has also changed. Now, we simply add
  two surrounding marker tokens to each "top-level" expansion
  sequence. There is an enumeration that controls expansion states.
  Also, no "previous" token is kept around.

- Preprocessor client methods changed signature so
  they now receive the line number of the action in question as
  a parameter. Previously such line could be retrieved by the client
  implementation by accessing the environment line. However, this
  is not reliable because we try to avoid synchronization of the
  output/environment lines in order to avoid unnecessary output,
  while expanding macros or handling preprocessor directives.

- Although macros are not expanded during define directives (as
  mentioned above) the preprocessor client is now "notified"
  when it sees a macro. This is to allow usage tracking.

- Other small stuff.

This is all in one patch because the fixes are a consequence
of the change in preprocessing control.

Change-Id: I8f4c6e6366f37756ec65d0a93b79f72a3ac4ed50
Reviewed-by: Roberto Raggi <roberto.raggi@nokia.com>
parent e99c1393
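The "no eager expansion" item in the summary can be illustrated with a small standalone sketch. This is not the tst_Preprocessor test case itself (its contents are not shown here); the macro names `IDENTITY`, `VALUE`, `GET_VALUE` and the variable `sample` are made up for illustration. It relies only on standard macro-replacement rules: nothing inside a `#define` is expanded at definition time.

```cpp
#include <type_traits>

// An object-like macro that shares its name with a macro parameter below.
#define T int

// 'T' is a parameter name here. If the body were expanded eagerly while
// processing this directive, it would become 'int' and the argument
// would be ignored.
#define IDENTITY(T) T

// The body of GET_VALUE is stored unexpanded; VALUE is looked up only
// when GET_VALUE is actually invoked.
#define VALUE 1
#define GET_VALUE VALUE
#undef VALUE
#define VALUE 2

// Substitution happens at the invocation, so this declares a double.
IDENTITY(double) sample = 1.5;

static_assert(std::is_same<decltype(sample), double>::value,
              "the argument replaces the parameter at invocation time");
static_assert(GET_VALUE == 2,
              "VALUE is resolved when GET_VALUE is used, not when it is defined");
```

Both assertions should hold with any conforming compiler, which matches the remark that GCC and clang agree on this behavior.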
......@@ -321,13 +321,29 @@ public:
public:
struct Flags {
// The token kind.
unsigned kind : 8;
// The token starts a new line.
unsigned newline : 1;
// The token is preceded by whitespace(s).
unsigned whitespace : 1;
// The token is joined with the previous one.
unsigned joined : 1;
// The token originates from a macro expansion.
unsigned expanded : 1;
// The token originates from a macro expansion and does not correspond to an
// argument that went through substitution. Notice the example:
//
// #define FOO(a, b) a + b;
// FOO(1, 2)
//
// After preprocessing we would expect the following tokens: 1 + 2;
// Tokens '1', '+', '2', and ';' are all expanded. However only tokens '+' and ';'
// are generated.
unsigned generated : 1;
// Unused...
unsigned pad : 3;
// The token length.
unsigned length : 16;
};
union {
......
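The distinction between the `expanded` and `generated` flags in the hunk above can be sketched with a toy expander. This is only an illustration under simplifying assumptions (single-token arguments, no nested macros, no stringification); `SketchToken` and `expandSketch` are hypothetical names and are not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the two flags discussed in the Flags comment.
struct SketchToken {
    std::string text;
    bool expanded;   // originates from a macro expansion
    bool generated;  // expanded, and not a substituted argument
};

// Expands one invocation of a function-like macro whose arguments are
// single tokens. Every resulting token is 'expanded'; only tokens that
// come from the macro body itself are also 'generated'.
std::vector<SketchToken> expandSketch(const std::vector<std::string> &params,
                                      const std::vector<std::string> &body,
                                      const std::vector<std::string> &args)
{
    assert(params.size() == args.size());
    std::vector<SketchToken> out;
    for (const std::string &tok : body) {
        auto it = std::find(params.begin(), params.end(), tok);
        if (it != params.end()) {
            // Argument substitution: the text is visible in the editor.
            const std::size_t idx = static_cast<std::size_t>(it - params.begin());
            out.push_back({args[idx], true, false});
        } else {
            // Token written only in the macro definition.
            out.push_back({tok, true, true});
        }
    }
    return out;
}
```

Calling `expandSketch({"a", "b"}, {"a", "+", "b", ";"}, {"1", "2"})` models `FOO(1, 2)` from the comment: all four tokens are expanded, while only `+` and `;` are generated.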
......@@ -27,8 +27,10 @@
#include "Literals.h"
#include "DiagnosticClient.h"
#include <stack>
#include <vector>
#include <cstdarg>
#include <algorithm>
#include <utility>
#ifdef _MSC_VER
# define va_copy(dst, src) ((dst) = (src))
......@@ -176,27 +178,84 @@ void TranslationUnit::tokenize()
pushPreprocessorLine(0, 1, fileId());
const Identifier *lineId = control()->identifier("line");
const Identifier *genId = control()->identifier("gen");
const Identifier *expansionId = control()->identifier("expansion");
const Identifier *beginId = control()->identifier("begin");
const Identifier *endId = control()->identifier("end");
// We need to track information about the expanded tokens. A vector with an additional
// explicit index control is used instead of a queue, mainly for performance reasons.
std::vector<std::pair<unsigned, unsigned> > lineColumn;
unsigned lineColumnIdx = 0;
bool generated = false;
Token tk;
do {
lex(&tk);
_Lrecognize:
_Lrecognize:
if (tk.is(T_POUND) && tk.newline()) {
unsigned offset = tk.offset;
lex(&tk);
if (! tk.f.newline && tk.is(T_IDENTIFIER) && tk.identifier == genId) {
// it's a gen directive.
if (! tk.f.newline && tk.is(T_IDENTIFIER) && tk.identifier == expansionId) {
// It's an expansion mark.
lex(&tk);
if (! tk.f.newline && tk.is(T_TRUE)) {
lex(&tk);
generated = true;
} else {
generated = false;
if (!tk.f.newline && tk.is(T_IDENTIFIER)) {
if (tk.identifier == beginId) {
// Start of a macro expansion section.
lex(&tk);
// Gather where the expansion happens and its length.
unsigned macroOffset = static_cast<unsigned>(strtoul(tk.spell(), 0, 0));
lex(&tk);
lex(&tk); // Skip the separating comma
unsigned macroLength = static_cast<unsigned>(strtoul(tk.spell(), 0, 0));
lex(&tk);
// NOTE: We are currently not using the macro offset and length. They
// are kept here for now because of future use.
Q_UNUSED(macroOffset)
Q_UNUSED(macroLength)
// Now we need to gather the real line and columns from the upcoming
// tokens. But notice this is only relevant for tokens which are expanded
// but not generated.
while (tk.isNot(T_EOF_SYMBOL) && !tk.f.newline) {
// When we get a ~ it means there's a number of generated tokens
// following. Otherwise, we have actual data.
if (tk.is(T_TILDE)) {
lex(&tk);
// Get the total number of generated tokens and specify "null"
// information for them.
unsigned totalGenerated =
static_cast<unsigned>(strtoul(tk.spell(), 0, 0));
const std::size_t previousSize = lineColumn.size();
lineColumn.resize(previousSize + totalGenerated);
std::fill(lineColumn.begin() + previousSize,
lineColumn.end(),
std::make_pair(0, 0));
lex(&tk);
} else if (tk.is(T_NUMERIC_LITERAL)) {
unsigned line = static_cast<unsigned>(strtoul(tk.spell(), 0, 0));
lex(&tk);
lex(&tk); // Skip the separating colon
unsigned column = static_cast<unsigned>(strtoul(tk.spell(), 0, 0));
// Store line and column for this non-generated token.
lineColumn.push_back(std::make_pair(line, column));
lex(&tk);
}
}
} else if (tk.identifier == endId) {
// End of a macro expansion.
lineColumn.clear();
lineColumnIdx = 0;
lex(&tk);
}
}
} else {
if (! tk.f.newline && tk.is(T_IDENTIFIER) && tk.identifier == lineId)
......@@ -211,9 +270,9 @@ void TranslationUnit::tokenize()
lex(&tk);
}
}
while (tk.isNot(T_EOF_SYMBOL) && ! tk.f.newline)
lex(&tk);
}
while (tk.isNot(T_EOF_SYMBOL) && ! tk.f.newline)
lex(&tk);
goto _Lrecognize;
} else if (tk.f.kind == T_LBRACE) {
braces.push(_tokens->size());
......@@ -225,7 +284,24 @@ void TranslationUnit::tokenize()
_comments->push_back(tk);
continue; // comments are not in the regular token stream
}
tk.f.generated = generated;
bool currentExpanded = false;
bool currentGenerated = false;
if (!lineColumn.empty() && lineColumnIdx < lineColumn.size()) {
currentExpanded = true;
const std::pair<unsigned, unsigned> &p = lineColumn[lineColumnIdx];
if (p.first)
_expandedLineColumn.insert(std::make_pair(tk.offset, p));
else
currentGenerated = true;
++lineColumnIdx;
}
tk.f.expanded = currentExpanded;
tk.f.generated = currentGenerated;
_tokens->push_back(tk);
} while (tk.f.kind);
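As a reading aid for the hunk above, the following sketch parses the part of an expansion section header that follows `# expansion begin`. The layout is inferred from the tokenizing loop only: an `offset,length` pair, then a sequence of `~N` (N generated tokens) or `line:column` entries. The whitespace-separated string input and the names `ExpansionHeaderSketch` and `parseExpansionHeaderSketch` are assumptions made for illustration; the real code works on lexer tokens, not on strings.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for the data carried by one expansion header.
struct ExpansionHeaderSketch {
    unsigned offset = 0;  // where the macro invocation starts
    unsigned length = 0;  // length of the macro invocation
    // One entry per expanded token; (0, 0) marks a generated token.
    std::vector<std::pair<unsigned, unsigned>> lineColumn;
};

// Parses text such as "12,3 ~1 4:7 ~2". Returns false on malformed input.
bool parseExpansionHeaderSketch(const std::string &text, ExpansionHeaderSketch *out)
{
    std::istringstream in(text);
    char sep = 0;
    if (!(in >> out->offset >> sep >> out->length) || sep != ',')
        return false;

    std::string item;
    while (in >> item) {
        if (item[0] == '~') {
            // "~N": N generated tokens follow, with no editor position.
            std::istringstream countStream(item.substr(1));
            unsigned count = 0;
            if (!(countStream >> count))
                return false;
            out->lineColumn.insert(out->lineColumn.end(), count,
                                   std::make_pair(0u, 0u));
        } else {
            // "line:column": position of a token that is visible in the editor.
            std::istringstream positionStream(item);
            unsigned line = 0;
            unsigned column = 0;
            char colon = 0;
            if (!(positionStream >> line >> colon >> column) || colon != ':')
                return false;
            out->lineColumn.push_back(std::make_pair(line, column));
        }
    }
    return true;
}
```

As in `tokenize()`, a zero line number is what later distinguishes generated tokens from expanded tokens that carry a real position.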
......@@ -355,12 +431,32 @@ void TranslationUnit::getPosition(unsigned tokenOffset,
unsigned *column,
const StringLiteral **fileName) const
{
unsigned lineNumber = findLineNumber(tokenOffset);
unsigned columnNumber = findColumnNumber(tokenOffset, lineNumber);
const PPLine ppLine = findPreprocessorLine(tokenOffset);
unsigned lineNumber = 0;
unsigned columnNumber = 0;
const StringLiteral *file = 0;
// If this token is expanded we already have the information directly from the expansion
// section header. Otherwise, we need to calculate it.
std::map<unsigned, std::pair<unsigned, unsigned> >::const_iterator it =
_expandedLineColumn.find(tokenOffset);
if (it != _expandedLineColumn.end()) {
lineNumber = it->second.first;
columnNumber = it->second.second + 1;
file = _fileId;
} else {
// Identify line within the entire translation unit.
lineNumber = findLineNumber(tokenOffset);
// Identify column.
columnNumber = findColumnNumber(tokenOffset, lineNumber);
lineNumber -= findLineNumber(ppLine.offset) + 1;
lineNumber += ppLine.line;
// Adjust the line in regards to the preprocessing markers.
const PPLine ppLine = findPreprocessorLine(tokenOffset);
lineNumber -= findLineNumber(ppLine.offset) + 1;
lineNumber += ppLine.line;
file = ppLine.fileName;
}
if (line)
*line = lineNumber;
......@@ -369,7 +465,7 @@ void TranslationUnit::getPosition(unsigned tokenOffset,
*column = columnNumber;
if (fileName)
*fileName = ppLine.fileName;
*fileName = file;
}
bool TranslationUnit::blockErrors(bool block)
......
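The lookup strategy in `getPosition()` above (use the recorded position for expanded tokens, otherwise compute it from offsets) can be sketched in isolation. `PositionSketch` and `PositionTableSketch` are hypothetical names, and the fallback is passed in as a callable instead of calling `findLineNumber()` and `findColumnNumber()`. The `+ 1` mirrors the hunk; that the stored column is zero-based is an assumption.

```cpp
#include <map>
#include <utility>

struct PositionSketch {
    unsigned line;
    unsigned column;
};

// Hypothetical reduction of the _expandedLineColumn lookup.
class PositionTableSketch {
public:
    // Remember the editor position of an expanded, non-generated token.
    void recordExpanded(unsigned tokenOffset, unsigned line, unsigned column)
    {
        m_expanded[tokenOffset] = std::make_pair(line, column);
    }

    // Prefer the recorded position; otherwise defer to the fallback,
    // which stands in for the offset-based computation.
    template <class Fallback>
    PositionSketch lookup(unsigned tokenOffset, Fallback computeFromOffsets) const
    {
        auto it = m_expanded.find(tokenOffset);
        if (it != m_expanded.end())
            return PositionSketch{it->second.first, it->second.second + 1};
        return computeFromOffsets(tokenOffset);
    }

private:
    std::map<unsigned, std::pair<unsigned, unsigned>> m_expanded;
};
```

A token recorded with `recordExpanded(40, 8, 4)` is reported at line 8, column 5; any other offset goes through the fallback.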
......@@ -27,7 +27,7 @@
#include "DiagnosticClient.h"
#include <cstdio>
#include <vector>
#include <map>
namespace CPlusPlus {
......@@ -170,6 +170,7 @@ private:
std::vector<Token> *_comments;
std::vector<unsigned> _lineOffsets;
std::vector<PPLine> _ppLines;
std::map<unsigned, std::pair<unsigned, unsigned> > _expandedLineColumn; // TODO: Replace this for a hash
MemoryPool *_pool;
AST *_ast;
TranslationUnit *_previousTranslationUnit;
......
......@@ -60,7 +60,7 @@ QByteArray FastPreprocessor::run(QString fileName, const QString &source)
return preprocessed;
}
void FastPreprocessor::sourceNeeded(QString &fileName, IncludeType, unsigned)
void FastPreprocessor::sourceNeeded(unsigned, QString &fileName, IncludeType)
{ mergeEnvironment(fileName); }
void FastPreprocessor::mergeEnvironment(const QString &fileName)
......
......@@ -59,18 +59,19 @@ public:
QByteArray run(QString fileName, const QString &source);
// CPlusPlus::Client
virtual void sourceNeeded(QString &fileName, IncludeType, unsigned);
virtual void sourceNeeded(unsigned, QString &fileName, IncludeType);
virtual void macroAdded(const Macro &) {}
virtual void passedMacroDefinitionCheck(unsigned, const Macro &) {}
virtual void passedMacroDefinitionCheck(unsigned, unsigned, const Macro &) {}
virtual void failedMacroDefinitionCheck(unsigned, const ByteArrayRef &) {}
virtual void notifyMacroReference(unsigned, unsigned, const Macro &) {}
virtual void startExpandingMacro(unsigned,
unsigned,
const Macro &,
const ByteArrayRef &,
const QVector<MacroArgumentReference> &) {}
virtual void stopExpandingMacro(unsigned, const Macro &) {}
virtual void startSkippingBlocks(unsigned) {}
......
......@@ -23,10 +23,10 @@ int ByteArrayRef::count(char ch) const
return num;
}
void Internal::PPToken::squeeze()
void Internal::PPToken::squeezeSource()
{
if (isValid()) {
m_src = m_src.mid(offset, length());
if (hasSource()) {
m_src = m_src.mid(offset, f.length);
m_src.squeeze();
offset = 0;
}
......
......@@ -96,6 +96,11 @@ public:
const QByteArray &source() const
{ return m_src; }
bool hasSource() const
{ return !m_src.isEmpty(); }
void squeezeSource();
const char *bufferStart() const
{ return m_src.constData(); }
......@@ -105,11 +110,6 @@ public:
ByteArrayRef asByteArrayRef() const
{ return ByteArrayRef(&m_src, offset, length()); }
bool isValid() const
{ return !m_src.isEmpty(); }
void squeeze();
private:
QByteArray m_src;
};
......
......@@ -80,24 +80,23 @@ public:
virtual void macroAdded(const Macro &macro) = 0;
virtual void passedMacroDefinitionCheck(unsigned offset, const Macro &macro) = 0;
virtual void passedMacroDefinitionCheck(unsigned offset, unsigned line, const Macro &macro) = 0;
virtual void failedMacroDefinitionCheck(unsigned offset, const ByteArrayRef &name) = 0;
virtual void notifyMacroReference(unsigned offset, unsigned line, const Macro &macro) = 0;
virtual void startExpandingMacro(unsigned offset,
unsigned line,
const Macro &macro,
const ByteArrayRef &originalText,
const QVector<MacroArgumentReference> &actuals
= QVector<MacroArgumentReference>()) = 0;
virtual void stopExpandingMacro(unsigned offset,
const Macro &macro) = 0;
virtual void stopExpandingMacro(unsigned offset, const Macro &macro) = 0;
/// Start skipping from the given offset.
virtual void startSkippingBlocks(unsigned offset) = 0;
virtual void stopSkippingBlocks(unsigned offset) = 0;
virtual void sourceNeeded(QString &fileName, IncludeType mode,
unsigned line) = 0; // ### FIX the signature.
virtual void sourceNeeded(unsigned line, QString &fileName, IncludeType mode) = 0;
};
} // namespace CPlusPlus
......
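The signature change above passes the line to the client explicitly instead of having the client read it from the environment. A minimal sketch of a client that relies on this for usage tracking might look as follows. `ClientSketch`, `UsageRecorderSketch` and `MacroUseSketch` are hypothetical, and a plain `std::string` stands in for `Macro`; only the parameter order of `notifyMacroReference` is taken from the diff.

```cpp
#include <string>
#include <vector>

// Hypothetical record of one macro use.
struct MacroUseSketch {
    std::string name;
    unsigned offset;
    unsigned line;
};

// Reduced stand-in for the preprocessor client interface.
class ClientSketch {
public:
    virtual ~ClientSketch() {}
    virtual void notifyMacroReference(unsigned offset, unsigned line,
                                      const std::string &macroName) = 0;
};

// Records each reference with the line supplied by the caller, so it
// never depends on a possibly unsynchronized environment line.
class UsageRecorderSketch : public ClientSketch {
public:
    void notifyMacroReference(unsigned offset, unsigned line,
                              const std::string &macroName) override
    {
        m_uses.push_back(MacroUseSketch{macroName, offset, line});
    }

    const std::vector<MacroUseSketch> &uses() const { return m_uses; }

private:
    std::vector<MacroUseSketch> m_uses;
};
```

The same idea applies to the other callbacks that gained a `line` parameter in this patch.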
......@@ -95,6 +95,7 @@ private:
public:
QString currentFile;
QByteArray currentFileUtf8;
unsigned currentLine;
bool hideNext;
......
......@@ -60,6 +60,7 @@
#include <QVector>
#include <QBitArray>
#include <QByteArray>
#include <QPair>
namespace CPlusPlus {
......@@ -92,10 +93,17 @@ private:
void preprocess(const QString &filename,
const QByteArray &source,
QByteArray *result, bool noLines, bool markGeneratedTokens, bool inCondition,
unsigned offsetRef = 0, unsigned envLineRef = 1);
unsigned offsetRef = 0, unsigned lineRef = 1);
enum { MAX_LEVEL = 512 };
enum ExpansionStatus {
NotExpanding,
ReadyForExpansion,
Expanding,
JustFinishedExpansion
};
struct State {
State();
......@@ -114,14 +122,17 @@ private:
bool m_inPreprocessorDirective;
QByteArray *m_result;
bool m_markGeneratedTokens;
bool m_markExpandedTokens;
bool m_noLines;
bool m_inCondition;
bool m_inDefine;
unsigned m_offsetRef;
unsigned m_envLineRef;
unsigned m_lineRef;
ExpansionStatus m_expansionStatus;
QByteArray m_expansionResult;
QVector<QPair<unsigned, unsigned> > m_expandedTokensInfo;
};
void handleDefined(PPToken *tk);
......@@ -129,9 +140,11 @@ private:
void lex(PPToken *tk);
void skipPreprocesorDirective(PPToken *tk);
bool handleIdentifier(PPToken *tk);
bool handleFunctionLikeMacro(PPToken *tk, const Macro *macro, QVector<PPToken> &body,
bool addWhitespaceMarker,
const QVector<QVector<PPToken> > &actuals);
bool handleFunctionLikeMacro(PPToken *tk,
const Macro *macro,
QVector<PPToken> &body,
const QVector<QVector<PPToken> > &actuals,
unsigned lineRef);
bool skipping() const
{ return m_state.m_skipping[m_state.m_ifLevel]; }
......@@ -155,30 +168,28 @@ private:
static bool isQtReservedWord(const ByteArrayRef &name);
inline bool atStartOfOutputLine() const
{ return (m_state.m_result && !m_state.m_result->isEmpty()) ? m_state.m_result->end()[-1] == '\n' : true; }
inline void startNewOutputLine() const
{
if (m_state.m_result && !m_state.m_result->isEmpty() && m_state.m_result->end()[-1] != '\n')
out('\n');
}
void genLine(unsigned lineno, const QByteArray &fileName) const;
inline void out(const QByteArray &text) const
{ if (m_state.m_result) m_state.m_result->append(text); }
void trackExpansionCycles(PPToken *tk);
inline void out(char ch) const
{ if (m_state.m_result) m_state.m_result->append(ch); }
template <class T>
void writeOutput(const T &t);
void writeOutput(const ByteArrayRef &ref);
bool atStartOfOutputLine() const;
void maybeStartOutputLine();
void generateOutputLineMarker(unsigned lineno);
void synchronizeOutputLines(const PPToken &tk, bool forceLine = false);
void removeTrailingOutputLines();
inline void out(const char *s) const
{ if (m_state.m_result) m_state.m_result->append(s); }
const QByteArray *currentOutputBuffer() const;
QByteArray *currentOutputBuffer();
inline void out(const ByteArrayRef &ref) const
{ if (m_state.m_result) m_state.m_result->append(ref.start(), ref.length()); }
void enforceSpacing(const PPToken &tk, bool forceSpacing = false);
static std::size_t computeDistance(const PPToken &tk, bool forceTillLine = false);
PPToken generateToken(enum Kind kind, const char *content, int len, unsigned lineno, bool addQuotes);
PPToken generateToken(enum Kind kind,
const char *content, int length,
unsigned lineno,
bool addQuotes,
bool addToControl = true);
PPToken generateConcatenated(const PPToken &leftTk, const PPToken &rightTk);
void startSkippingBlocks(const PPToken &tk) const;
......
......@@ -340,7 +340,7 @@ public:
void CppPreprocessor::run(const QString &fileName)
{
QString absoluteFilePath = fileName;
sourceNeeded(absoluteFilePath, IncludeGlobal, /*line = */ 0);
sourceNeeded(0, absoluteFilePath, IncludeGlobal);
}
void CppPreprocessor::resetEnvironment()
......@@ -499,12 +499,12 @@ void CppPreprocessor::macroAdded(const Macro &macro)
m_currentDoc->appendMacro(macro);
}
void CppPreprocessor::passedMacroDefinitionCheck(unsigned offset, const Macro &macro)
void CppPreprocessor::passedMacroDefinitionCheck(unsigned offset, unsigned line, const Macro &macro)
{
if (! m_currentDoc)
return;
m_currentDoc->addMacroUse(macro, offset, macro.name().length(), env.currentLine,
m_currentDoc->addMacroUse(macro, offset, macro.name().length(), line,
QVector<MacroArgumentReference>());
}
......@@ -516,15 +516,23 @@ void CppPreprocessor::failedMacroDefinitionCheck(unsigned offset, const ByteArra
m_currentDoc->addUndefinedMacroUse(QByteArray(name.start(), name.size()), offset);
}
void CppPreprocessor::startExpandingMacro(unsigned offset,
void CppPreprocessor::notifyMacroReference(unsigned offset, unsigned line, const Macro &macro)
{
if (! m_currentDoc)
return;
m_currentDoc->addMacroUse(macro, offset, macro.name().length(), line,
QVector<MacroArgumentReference>());
}
void CppPreprocessor::startExpandingMacro(unsigned offset, unsigned line,
const Macro &macro,
const ByteArrayRef &originalText,
const QVector<MacroArgumentReference> &actuals)
{
if (! m_currentDoc)
return;
m_currentDoc->addMacroUse(macro, offset, originalText.length(), env.currentLine, actuals);
m_currentDoc->addMacroUse(macro, offset, macro.name().length(), line, actuals);
}
void CppPreprocessor::stopExpandingMacro(unsigned, const Macro &)
......@@ -573,7 +581,7 @@ void CppPreprocessor::stopSkippingBlocks(unsigned offset)
m_currentDoc->stopSkippingBlocks(offset);
}
void CppPreprocessor::sourceNeeded(QString &fileName, IncludeType type, unsigned line)
void CppPreprocessor::sourceNeeded(unsigned line, QString &fileName, IncludeType type)
{
if (fileName.isEmpty())
return;
......@@ -590,7 +598,7 @@ void CppPreprocessor::sourceNeeded(QString &fileName, IncludeType type, unsigned
Document::DiagnosticMessage d(Document::DiagnosticMessage::Warning,
m_currentDoc->fileName(),
env.currentLine, /*column = */ 0,
line, /*column = */ 0,
msg);
m_currentDoc->addDiagnosticMessage(d);
......
......@@ -300,17 +300,19 @@ protected:
void mergeEnvironment(CPlusPlus::Document::Ptr doc);
virtual void macroAdded(const CPlusPlus::Macro &macro);
virtual void passedMacroDefinitionCheck(unsigned offset, const CPlusPlus::Macro &macro);
virtual void passedMacroDefinitionCheck(unsigned offset, unsigned line,
const CPlusPlus::Macro &macro);
virtual void failedMacroDefinitionCheck(unsigned offset, const CPlusPlus::ByteArrayRef &name);
virtual void notifyMacroReference(unsigned offset, unsigned line,
const CPlusPlus::Macro &macro);
virtual void startExpandingMacro(unsigned offset,
unsigned line,
const CPlusPlus::Macro &macro,
const CPlusPlus::ByteArrayRef &originalText,
const QVector<CPlusPlus::MacroArgumentReference> &actuals);
virtual void stopExpandingMacro(unsigned offset, const CPlusPlus::Macro &macro);
virtual void startSkippingBlocks(unsigned offset);
virtual void stopSkippingBlocks(unsigned offset);
virtual void sourceNeeded(QString &fileName, IncludeType type,
unsigned line);
virtual void sourceNeeded(unsigned line, QString &fileName, IncludeType type);
private:
#ifndef ICHECK_BUILD
......
# 1 "data/empty-macro.2.cpp"
# 6 "data/empty-macro.2.cpp"
class Test {
private:
Test
#gen true
# 3 "data/empty-macro.2.cpp"
(const
#gen false
# 8 "data/empty-macro.2.cpp"
Test
#gen true
# 3 "data/empty-macro.2.cpp"
&);
#gen false
# 8 "data/empty-macro.2.cpp"
Test
#gen true
# 4 "data/empty-macro.2.cpp"
&operator=(const